Pushing UTF-8 support into LFS
Alexander E. Patrakov
patrakov at ums.usu.ru
Sat Aug 6 21:59:06 PDT 2005
A sample LFS-like system that supports UTF-8 is available on a live CD.
So, it may be a good idea to create an experimental branch of the LFS
book that incorporates the same changes. LFS built according to that
branch should work in both UTF-8 and traditional locales. So, patches
that make things work in UTF-8 but break the non-UTF-8 case are a no-go.
A summary of changes (including those that are on the official non-UTF-8
CD) is below. Details for each package (and screenshots that illustrate
the problems) will be available on request.
sharutils: added to chapter 5 because the ncurses rollup shell script
Ncurses: upgraded to 20050319 version (or at least applied the
-altcharset-1 patch), built with --enable-widec. Compatibility linker
scripts are created so that apps that want -lncurses are actually linked
LFS Bootscripts: the "console" script is rewritten.
sysklogd: the logic that treats bytes 0x80-0x9f as unprintable
characters should be disabled, a patch is available.
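Why such a filter hurts in UTF-8 can be sketched as follows. This is a
minimal Python illustration, not sysklogd's actual code: the function
name strip_c1_bytes and the choice of '?' as the replacement are
assumptions made for the example. It shows that the UTF-8 encoding of
ordinary Cyrillic text contains bytes in the 0x80-0x9f range, so a
byte-level filter over that range corrupts the message.

```python
def strip_c1_bytes(data: bytes) -> bytes:
    """Hypothetical filter: replace every byte in 0x80-0x9f with '?'."""
    return bytes(0x3F if 0x80 <= b <= 0x9F else b for b in data)


message = "Привет"                      # Russian for "hello"
encoded = message.encode("utf-8")       # each letter becomes two bytes

# Bytes of the UTF-8 form that fall into the "unprintable" range.
hit = [hex(b) for b in encoded if 0x80 <= b <= 0x9F]

# After filtering, the byte stream is no longer the original text.
filtered = strip_c1_bytes(encoded)
decoded = filtered.decode("utf-8", errors="replace")

print(hit)
print(decoded == message)
```

Plain ASCII input passes through such a filter unchanged, which is why
the problem only shows up once log messages contain non-ASCII text.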
coreutils: big patch from RedHat. Unfortunately, it has a bad bug history.
gawk: either a big patch from RedHat or a beta version (but it fails one
test in its testsuite). When gawk-3.1.5 is released, no patches will be
needed. Expect more bugs to show up in dfa.c.
grep: big patch from RedHat. Expect more bugs to show up in dfa.c.
GNU Groff-1.19.1: replaced with Debian Groff 126.96.36.199-8
gdbm: added to LFS as a dependency of man-db
man: replaced with man-db
diffutils: patch from RedHat
linux: a patch is necessary for dead keys to work in UTF-8 mode.
LOW PRIORITY LFS:
glibc: the CD uses a patch that alters the list of supported locales.
The no_NO and vi_VN.TCVN removals are bugfixes; the rest of the patch is
a cosmetic tweak. libidn is nice too but also optional.
kbd: a patch is available that fixes all known keymaps that have the
backspace/delete problem, so that KEYMAP_CORRECTIONS are rarely needed.
texinfo: a minor patch exists that forces a fallback to the English
interface in multibyte locales.
readline: almost works as-is. RedHat also applies patches for the
wrapping problem and for a segfault in lftp.
vim: some of the upstream patches fix problems in multibyte locales. I
applied all upstream fixes on the CD. Also it is necessary to remove
translated non-ISO-8859-1 tutorials because they are unreadable in UTF-8
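The tutorial problem can be illustrated with a short Python sketch. It
assumes, for the sake of the example, a tutorial stored in KOI8-R (the
text above does not name specific encodings), and the helper name
readable_as_utf8 is hypothetical. The point is only that text saved in a
legacy 8-bit encoding is generally not a valid UTF-8 byte sequence.

```python
def readable_as_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "Привет"                       # Russian for "hello"
koi8 = text.encode("koi8_r")          # assumed legacy encoding of a tutorial
utf8 = text.encode("utf-8")           # the same text re-encoded
plain = "Hello".encode("ascii")       # ASCII is valid UTF-8 as-is

print(readable_as_utf8(koi8))
print(readable_as_utf8(utf8))
print(readable_as_utf8(plain))
```

Re-encoding such files to UTF-8 would be an alternative to removing
them, but it would then break them in the traditional locales.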
BLFS PACKAGES ON THE CD:
cdrtools: a patch for mkisofs is needed in order to create
Windows-readable CDs. Also the name of the author is transliterated so
that non-ISO-8859-1 users can read it.
thunderbird, firefox: a patch is available that works around the problem
with displaying dates in the "expired certificate" dialog and with gpg
messages in Enigmail. Needed for all non-ISO-8859-1 locales.
xfce: one Chinese message is mis-marked as Russian. A patch is available
and accepted upstream.
Xorg: there's a patch that makes Xorg understand more glibc locale names.
nALFS: a patch is necessary in order to display line drawing
characters on the Linux console properly.
GPM: built --without-curses. If you want mouse support on the Linux
console, please build ncurses --with-gpm instead.
OTHER TASKS FOR BLFS:
mark broken packages that should not be installed on a system that
BUGS ON THE CD:
Alexander E. Patrakov