Please review for Man-DB changes

Alexander E. Patrakov patrakov at gmail.com
Sat Oct 25 00:55:24 PDT 2008


DJ Lucas wrote:
> DJ Lucas wrote:
>> Thank you again for the detailed critique, suggestions and examples.  
>> You've been a great help.  I'll have another go at it using your text above.
>>
>>   
> OK, I think this is almost the final...
> 
> http://www.linuxfromscratch.org/~dj/LFS-MANDB/chapter06/man-db.html


> Some packages provide non-English manual pages. They are displayed
> correctly only if their location and encoding matches the expectation
> of the "man" program. However, different Linux distributions have
> different policies (expressed in the choice of the man program, its
> configuration and patches applied to it) concerning the character
> encoding in which manual pages are stored in the filesystem.
> 
> E.g., Debian previously required Russian manual pages to be encoded
> in KOI8-R and to be placed in /usr/share/man/ru. Now, in addition,
> their man program (Man-DB) searches for UTF-8 encoded Russian manual
> pages in /usr/share/man/ru.UTF-8. On the other hand, Fedora uses
> UTF-8 encoded manual pages exclusively. Russian manual pages are
> found in /usr/share/man/ru and their man program doesn't acknowledge
> /usr/share/man/ru.UTF-8. Many other distributions ignore the problem
> completely, leaving the end user with a mix of readable and
> unreadable manual pages, and even worse yet, unreadable error
> messages when a suitable manual page is not found.

"ignore the problem" => which problem? The text suggests that many 
distributions ignore the fact that different distributions have 
different policies. Some other wording is needed. Maybe: "Many other 
distributions ignore the need for a consistent policy, leaving the user 
with ..."?

"a mix of readable and unreadable manual pages" - yes, very well 
spotted, better than I formulated on this list! However, there is a very 
low-priority wish: some people will misinterpret the word "unreadable" 
as "no way to make the man program access this file" instead of "man 
reads this file and displays garbage". Here a picture would be worth a 
thousand words, but pictures are not in the current LFS tradition.
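
To make "displays garbage" concrete, here is a minimal Python sketch. It 
is not anything the man program itself runs; the sample text and the 
render() helper are made up for illustration, and the only assumption is 
that a page stored in KOI8-R gets decoded as if it were UTF-8:

```python
# Illustration only: what happens when the bytes of a manual page are
# decoded with an encoding other than the one they were stored in.

SAMPLE_TEXT = "справочная страница"  # hypothetical page content ("manual page")


def render(stored_bytes, assumed_encoding):
    """Decode stored bytes the way a viewer assuming `assumed_encoding` would.

    Undecodable bytes become U+FFFD, the replacement character.
    """
    return stored_bytes.decode(assumed_encoding, errors="replace")


if __name__ == "__main__":
    page = SAMPLE_TEXT.encode("koi8_r")   # page stored in a legacy encoding
    print(render(page, "koi8_r"))         # matching expectation: readable
    print(render(page, "utf-8"))          # mismatched expectation: garbage
```

The file is read without any error in both cases; only the second 
result is useless to the reader, which is the sense of "unreadable" 
meant here.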

"and, even worse yet, unreadable error messages" => no, unreadable pages 
are worse. And this situation follows from a bug in the "man" program 
(it uses the obsolete catgets interface instead of gettext), not from 
misplaced or misencoded manual pages, so let's not mention it.

> Disagreement about the expected encoding of manual pages amongst
> distribution vendors has led to confusion for upstream package
> maintainers. One package may contain UTF-8 manual pages, while
> another ships with manual pages in legacy encodings. Man-DB uses a
> built-in table (see below) to find the correct search directory for
> manual pages based on the user's locale settings.

No, it doesn't look into the table in this case. See add_nls_manpath() 
in http://www.chiark.greenend.org.uk/~cjwatson/bzr/man-db/trunk/src/manp.c

It iterates over all subdirectories and tests whether the subdirectory 
is for the user's language, completely disregarding the encoding. IOW, 
all of /usr/share/man/ru{,.KOI8-R,.CP1251,.UTF-8} are searched in all of 
ru_RU.KOI8-R, ru_RU.CP1251 (unofficial, has to be localedef'ed manually) 
and ru_RU.UTF-8 locales.
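
As a rough illustration of that behaviour, a simplified Python model 
follows. It is not a translation of add_nls_manpath(); the helper names 
and the directory list are hypothetical, and only the language 
comparison described above is modelled:

```python
# Simplified model: pick the manual page subdirectories whose language
# matches the language of the user's locale, ignoring the encoding.


def language_of(name):
    """Return the language part of a locale or directory name.

    For example, 'ru_RU.KOI8-R', 'ru.UTF-8' and 'ru' all give 'ru'.
    """
    for separator in (".", "@", "_"):
        name = name.split(separator, 1)[0]
    return name


def nls_search_dirs(subdirs, user_locale):
    """List the subdirectories searched for `user_locale` in this model."""
    wanted = language_of(user_locale)
    return [d for d in subdirs if language_of(d) == wanted]


if __name__ == "__main__":
    # Hypothetical contents of /usr/share/man
    subdirs = ["man1", "ru", "ru.KOI8-R", "ru.CP1251", "ru.UTF-8", "de"]
    for locale_name in ("ru_RU.KOI8-R", "ru_RU.CP1251", "ru_RU.UTF-8"):
        print(locale_name, nls_search_dirs(subdirs, locale_name))
```

In this model every Russian locale gets the same four directories, 
whatever its own encoding is.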

The rest is OK.

> ...I have a couple of questions:

<snip already answered questions>

> Was this an LFS 
> only problem in that we didn't pass '+lang none' to Man's build?

It is a problem for all distributions that don't pass '+lang none'. No 
distribution known to me passes '+lang none'. Fedora converts error 
messages so that they look right in UTF-8 locales, but this makes them 
incorrect in legacy locales.

> Also, I think the one line paragraph above the table can be removed 
> completely since the table is explained in the paragraph above that, but 
> I'm not sure.

Yes, remove it.

-- 
Alexander E. Patrakov


