New personal experimental book

Alexander E. Patrakov patrakov at gmail.com
Sat Sep 13 02:01:19 PDT 2008


Lefteris Dimitroulakis wrote:

> > Since you install Man, you get relevant man pages translated into Greek.
> > So you may add in your Table 6.1 "Greek (el)   ISO-8859-7".
>
> Additionally:
> Bulgarian (bg) cp1251
> Romanian (ro) ISO-8859-2
> Slovenian (sl)  ISO-8859-2

NAK.

The table in the text is really a copy of a table in the Man-DB source, 
because the expectations of Man-DB can't be changed. With Man, the encoding 
expectations depend on the NROFF and JNROFF lines. So, you can't really 
suggest this without knowing how DJ Lucas is going to configure Man. Your 
suggestion is valid if one uses the default NROFF line (and thus avoids 
groff-utf8) together with a non-UTF-8 locale. However, this differs from 
the expected future direction.

DJ: I will reject everything related to Man(-DB) reconfiguration if it doesn't 
discuss (by means of text, not only commands) the following items:

1) The list of subdirectories of /usr/share/man where a manual page for a 
given language is looked up

Currently, /usr/share/man/ll* is searched (i.e., /usr/share/man/ru, 
/usr/share/man/ru.KOI8-R and /usr/share/man/ru.UTF-8 are all searched in 
each of the ru_RU.KOI8-R, ru_RU.CP1251 and ru_RU.UTF-8 locales), followed 
by /usr/share/man if nothing is found.
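
The lookup described above can be sketched in a few lines of Python. This 
is only an illustration of the ll* rule, not Man-DB's actual code: the 
function names are hypothetical, the list of subdirectories is passed in 
rather than read from disk, and the alphabetical search order is an 
assumption (the message does not say in which order the matches are tried).

```python
from fnmatch import fnmatch


def language_code(locale_name):
    """Return the two-letter language part, e.g. "ru_RU.KOI8-R" -> "ru"."""
    return locale_name.split(".")[0].split("_")[0]


def search_dirs(locale_name, subdirs, root="/usr/share/man"):
    """List the directories searched for a translated page, then the fallback.

    `subdirs` holds the names of the subdirectories present under `root`.
    Sorting the matches is an assumption made for reproducible output.
    """
    ll = language_code(locale_name)
    matches = sorted(d for d in subdirs if fnmatch(d, ll + "*"))
    return [f"{root}/{d}" for d in matches] + [root]


if __name__ == "__main__":
    present = ["ru", "ru.KOI8-R", "ru.UTF-8", "de", "man1"]
    for locale_name in ("ru_RU.KOI8-R", "ru_RU.CP1251", "ru_RU.UTF-8"):
        print(locale_name, search_dirs(locale_name, present))
```

All three Russian locales yield the same list, which is the point of the 
rule: the charset part of the locale name plays no role in the lookup.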

2) For both UTF-8 and non-UTF-8 locales, the encoding at the input and at the 
output of every program that is involved in formatting and displaying the 
manual page

Yes, I understand that it is more than is currently in the book, but only 
the encoding on disk matters now because the processing pipeline is 
hard-coded in Man-DB.

Anyway, in the non-CJK case: the on-disk encoding of manual pages is 
inferred from their location (e.g., "/usr/share/man/ru" => "KOI8-R" 
according to the table, "/usr/share/man/ru.UTF-8" => "UTF-8" because it is 
in the directory name). Then, if this is not a no-op, Man-DB runs iconv to 
convert to the language-specific 8-bit encoding listed in the table. Then it 
runs one of the following: "groff -Tutf8" (if the input is in ISO-8859-1 and 
the output should be in UTF-8), "groff -Tlatin1" (if both input and output 
are in ISO-8859-1), or "groff -Tascii8" followed by iconv from the 8-bit 
charset to the charset of the user's locale (in all other cases).

In the CJK case: the on-disk encoding of manual pages is inferred from their 
location. Then, if this is not a no-op, Man-DB runs iconv to convert to the 
locale encoding, and passes the result of the conversion through 
"groff -Tnippon".
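
To make the decision rules above easier to follow, here is a rough Python 
sketch that turns them into a list of commands. It is not taken from 
Man-DB: the function names are hypothetical, the TABLE excerpt contains only 
the "ru" entry mentioned above plus an illustrative "de" entry that is my 
assumption, and skipping the final iconv when it would be a no-op is also an 
assumption.

```python
# Hypothetical excerpt of the language -> 8-bit encoding table.
# Only "ru" comes from the message; "de" is an illustrative assumption.
TABLE = {"ru": "KOI8-R", "de": "ISO-8859-1"}


def on_disk_encoding(dirname, table=TABLE):
    """Infer the page encoding: "ru.UTF-8" -> "UTF-8", "ru" -> table entry."""
    lang, dot, charset = dirname.partition(".")
    return charset if dot else table[lang]


def iconv_step(src, dst):
    """Return an iconv command, or nothing when the conversion is a no-op."""
    return [] if src == dst else [f"iconv -f {src} -t {dst}"]


def plan(dirname, locale_charset, cjk=False, table=TABLE):
    """Return the commands applied to a page found in directory `dirname`."""
    disk = on_disk_encoding(dirname, table)
    if cjk:
        # CJK: convert straight to the locale encoding, then groff -Tnippon.
        return iconv_step(disk, locale_charset) + ["groff -Tnippon"]
    # Non-CJK: first convert to the language-specific 8-bit encoding.
    eight_bit = table[dirname.partition(".")[0]]
    steps = iconv_step(disk, eight_bit)
    if eight_bit == "ISO-8859-1" and locale_charset == "UTF-8":
        return steps + ["groff -Tutf8"]
    if eight_bit == "ISO-8859-1" and locale_charset == "ISO-8859-1":
        return steps + ["groff -Tlatin1"]
    # All other cases: 8-bit pass-through, then convert for the user's locale.
    return steps + ["groff -Tascii8"] + iconv_step(eight_bit, locale_charset)


if __name__ == "__main__":
    for args in (("ru", "KOI8-R"), ("ru.UTF-8", "UTF-8"), ("de", "UTF-8")):
        print(args, plan(*args))
```

Note that a UTF-8 page viewed in a UTF-8 Russian locale still makes a round 
trip through KOI8-R under these rules.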

3) Instructions to reconfigure your system from a UTF-8 to a non-UTF-8 
locale (or the other way round) without reinstalling all packages that 
provide translated manual pages

Currently, no actions related to manual pages are needed. Just 
edit /etc/sysconfig/console and /etc/profile, and convert the files in your 
home directory.

-- 
Alexander E. Patrakov


