i18n: summary

Alexander E. Patrakov see at the.sig
Thu Jul 8 07:06:51 PDT 2004

(sorry for the delay, I was out of the city and could not read my mail)

The LFS book has the following things that are related to i18n:

1) the "console" script replaced the traditional "loadkeys" one. The
difference is that the "console" script can also set a screen font.
2) setting LC_ALL and LANG in /etc/profile
3) no locale-specific patches
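
For illustration, item 2 might look roughly like the fragment below. This
is only a sketch: ru_RU.KOI8-R is an example single-byte locale picked
here, not a value taken from the book, and the locale must already be
installed for the settings to take effect.

```shell
# Sketch of an /etc/profile fragment (assumed example, not the book's text).
# ru_RU.KOI8-R is a placeholder for any single-byte, left-to-right locale.
LANG=ru_RU.KOI8-R
LC_ALL=ru_RU.KOI8-R
export LANG LC_ALL
```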

These steps are sufficient to get a fully usable system in Europe and
Russia. In fact, this level of i18n is _complete_ in locales that
utilize single-byte character sets and left-to-right writing order, for
packages that use gettext. The following enhancements could still be
made in LFS:

1) The warning that LFS should not be used in other locales currently
sits on the glibc page in Chapter 6. That is too late in the book. I'd
rather split it into two sentences. The first ("LFS supports only locales
that utilize...") should go in Chapter 1; the second ("we install
fa_IR.UTF-8 for gettext tests to pass") should stay in Chapter 6.

2) Support for other locales, including UTF-8 based ones. This involves
numerous not-very-stable patches and a different ncurses build procedure,
and still doesn't reach the goal. Even the most UTFy distribution, Fedora
Core 2, has unforgivable multibyte-related bugs. In short, this is not
good even for BLFS. In fact, I dropped all support for my UTF-8 hint
because I no longer use UTF-8.

3) Change the wording on the Man page. Was:

> Note that you should use "latin1" even if it is not the character set
> of your locale. The reason is that, according to the specification, 
> groff has no means of typesetting characters outside ISO-8859-1 
> without some strange escape codes, and localized manual pages are 
> therefore really a hack. When formatting manual pages, groff thinks 
> that they are in the ISO-8859-1 encoding and this -Tlatin1 switch 
> tells groff to use the same encoding for output. Since groff does no 
> recoding of input characters, the formatted result is really in the 
> same encoding as input (although groff doesn't know that it is not 
> ISO-8859-1) and therefore it is usable as the input for a pager.

Should be:

Note that manual pages for languages that use encodings other than
ISO-8859-1 are not valid groff input. The reason is that, according
to the specification, groff has no means of typesetting characters 
outside ISO-8859-1 without some strange escape codes (that are not used 
anyway in such manual pages). A program that accepts other input 
encodings cannot be called "groff". Localized manual pages are really a 
hack, but a hack that is too widespread to be ignored by us. Here we ask 
groff not to recode its input, by using the -Tlatin1 switch (the output 
is really NOT in the latin1 encoding, but we have to lie to groff 
because it always thinks that its input is in ISO-8859-1). Since in this 
configuration groff doesn't convert encodings (but acts as a 
pass-through filter for printable characters), its output is usable as 
the input for a pager, and therefore one can view localized manual pages.

4) There are some pointers to i18n-related hints, but in fact NO
i18n-related hints exist (except the UTF-8 one, which is bad, and a
Chinese hint that I cannot find right now and that deals mainly with
XIM). The pointers should be removed.

5) We might consider replacing groff-1.19.1 with groff-1.18.1 plus the
Debian patch, which adds some non-standard-conformant devices like
"ascii8" that help avoid the above-mentioned hack. All good distros
use patched groff-1.18.1.

6) We currently don't install the localized manual pages that come with
the "man" package.

7) In BLFS, one could internationalize boot scripts.

8) There is still an open bug to add i18n to the Mozilla instructions in
BLFS.
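
To make the -Tlatin1 trick from item 3 concrete, here is a minimal
sketch. The function name and the file page.1 are hypothetical, and
-mandoc is used only so that groff picks the macro package itself.

```shell
# Sketch only: format a localized manual page without recoding it.
# The function name is made up for this example.
format_localized_man() {
    # -Tlatin1: groff assumes ISO-8859-1 input and is asked for the same
    # encoding on output, so (as described in item 3) printable 8-bit
    # characters should pass through unchanged.
    # -mandoc: let groff choose between the man and mdoc macros.
    groff -Tlatin1 -mandoc "$1"
}
```

Usage would be something like "format_localized_man page.1 | less", with
page.1 standing for a manual page in, say, KOI8-R.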

Alexander E. Patrakov
To get my address: echo '0!42!+/6 at 5-3.535.25' | tr \!-: a-z | tr n .
