New personal experimental book [warning: lots of UTF-8 in this]

DJ Lucas dj at linuxfromscratch.org
Tue Sep 30 19:03:00 PDT 2008


Ken Moffat wrote:
> On Tue, Sep 30, 2008 at 04:54:33PM +0100, Colin Watson wrote:
>   
>> On Tue, Sep 30, 2008 at 03:48:50PM +0100, Ken Moffat wrote:
>>     
>>> On Tue, Sep 30, 2008 at 12:27:09PM +0100, Colin Watson wrote:
>>>       
>>>> Modern versions of man-db default to expecting UTF-8 for manual 
>>>> page source (although if they realise that the page is actually 
>>>> encoded in a legacy encoding then they'll automatically fall back 
>>>> to that), and will generate whatever is appropriate for the user's 
>>>> locale.
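
A minimal sketch of the "expect UTF-8, fall back to a legacy encoding" idea described above. This is not man-db's actual logic: it assumes `iconv` is available, and both the `page_to_utf8` helper name and the ISO-8859-1 default are hypothetical choices made for illustration.

```shell
# Hypothetical helper: print a manual page source as UTF-8.
# Try UTF-8 first; if the bytes are not valid UTF-8, fall back to an
# assumed legacy encoding (ISO-8859-1 unless a second argument is given).
page_to_utf8() {
    page_file=$1
    legacy_enc=${2:-ISO-8859-1}
    if iconv -f UTF-8 -t UTF-8 "$page_file" >/dev/null 2>&1; then
        # Already valid UTF-8: pass the page through untouched.
        cat "$page_file"
    else
        # Not valid UTF-8: recode from the assumed legacy encoding.
        iconv -f "$legacy_enc" -t UTF-8 "$page_file"
    fi
}
```

For example, `page_to_utf8 foo.1 ISO-8859-2` would recode a page that fails UTF-8 validation from ISO-8859-2; the real per-language choice of fallback encoding is man-db's business and is not reproduced here.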
>>>  I like the sound of that.  It's not the way we've been doing things
>>> since we switched to man-db in LFS, and we have text (perhaps
>>> carried forward in error) saying that man-db can't display UTF-8.
>>> See
>>> http://www.linuxfromscratch.org/lfs/view/development/chapter06/man-db.html
>>>  part 6.45.2 in the middle of the page.
>>>       
>> Ah, that definitely used to be true but is false as of man-db 2.5.0
>> (though you should really use at least 2.5.1 - 2.5.0 didn't get the
>> encoding fallback logic quite right).
>>
>> Since I have the opportunity (and thanks, I hadn't seen that page
>> before), it seems worth going through the rest of that page. If I should
>> file these as bugs instead, let me know, or feel free to forward this to
>> lfs-dev, or whatever.
>>     
>  I suppose one of _us_ ought to file it, once this hits the list
> archives.
>   
This CC to lfs-dev is sufficient, but if you or anyone else wants to 
correct the text, please file away.  :-)
>>   The first change is a sed substitution to delete the “/usr/man” and
>>   “/usr/local/man” lines in the man_db.conf file to prevent redundant
>>   results when using programs such as whatis:
>>
>> Do you make /usr/man and /usr/local/man symlinks? If so, I could detect
>> that and skip them automatically.
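
The kind of sed deletion described above might look roughly like this. It is a sketch, not the exact command from the book: the `strip_redundant_manpaths` helper name is hypothetical, and it assumes the paths appear as the last whitespace-separated field on their lines.

```shell
# Hypothetical helper: print a man_db.conf-style file with every line
# whose last field is /usr/man or /usr/local/man removed.
# The \%...% form only picks '%' as the regex delimiter so the slashes
# in the paths need no escaping.
strip_redundant_manpaths() {
    sed -e '\%[[:space:]]/usr/man[[:space:]]*$%d' \
        -e '\%[[:space:]]/usr/local/man[[:space:]]*$%d' "$1"
}
```

Lines that point at /usr/share/man are left alone, since neither pattern matches them.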
>>     
>>   The second change accounts for programs that Man-DB should be able to
>>   find at runtime, but that haven't been installed yet:
>>
>> I made configure options available for these in 2.5.0, so you could use
>> '--with-browser=lynx --with-col=col --with-vgrind=vgrind
>> --with-grap=grap' instead.
>>
>>   Prepare Man-DB for compilation:
>>
>> I think I already suggested this to somebody else at LFS, but I'd
>> recommend that you use --with-db=gdbm rather than the default of
>> Berkeley DB (which is something of an awkward beast, and overkill for
>> man-db). This will be the default in man-db 2.5.3.
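
Put together, the configure options suggested above could be assembled along these lines. This is only a sketch: the four program options and `--with-db=gdbm` are quoted from the message, while `--prefix=/usr` and the `mandb_configure_cmd` helper name are assumptions added for illustration.

```shell
# Hypothetical helper: print the configure command line built from the
# options suggested in this thread, without running it.
mandb_configure_cmd() {
    # --prefix=/usr is an assumption; the remaining flags come from the
    # message above.
    set -- ./configure --prefix=/usr \
        --with-browser=lynx --with-col=col \
        --with-vgrind=vgrind --with-grap=grap \
        --with-db=gdbm
    printf '%s\n' "$*"
}
```

Printing the command rather than running it makes it easy to inspect first; inside an unpacked man-db source tree the printed line could then be run as-is.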
>>     
Other packages in the base LFS system use BDB.  They may or may not work 
with GDBM, so I'll be looking into that as soon as we have updated to 
reasonable revisions of all installed 'base' software.  My question, 
however: will man-db 2.5.3 still allow the use of BDB in the near future?
>> And, yes, I think you can get rid of the convert-mans business entirely.
>> With the exception of a few hopelessly misencoded pages that are really
>> lost causes, man-db can pretty much cope with any of the obvious
>> candidates for encoding pages in each language now.
>>     
This is very nice!
>> I noticed a comment in there about Norwegian not working, and have fixed
>> it for man-db 2.5.3.
>>
>>     
>>>> In the distributions I'm most directly involved with, namely Debian and
>>>> Ubuntu, everything is set up for UTF-8 output by default, and we've
>>>> arranged for the packaging tools to automatically convert pages to UTF-8
>>>> on installation with the aid of some helper tools I ship with man-db;
>>>>         
These will also be very useful in BLFS.
>>>> while this latter item has only been running for a few months, it won't
>>>> be long until we'll be running with UTF-8 across the board. As soon as
>>>> groff upstream finishes off Unicode support then we'll use that and the
>>>> whole pipeline will be UTF-8, but for the meantime we recode back and
>>>> forward behind the scenes and very few people have to notice or care.
>>>>         
>>>  I'll also take a look at this part, it sounds good.  I hope you're not
>>> holding your breath for a UTF-8-capable version of groff ;-)
>>>       
>> Oh, certainly not; I've put a lot of effort into not holding my breath
>> for that! That said, I'd be entirely happy to make man-db able to use
>> groff-utf8 as an option if that's what you guys would prefer.
>>
>>     
>
>  I haven't yet looked at what you are doing in 2.5.2, or what
> versions of groff you are using in ubuntu and debian, but I'm fairly
> sure most LFS users won't want to use groff-utf8 if it isn't needed.
> It's only a temporary hack until groff is fixed.
>
>   
Definitely not.  It looks like a sensible, long-term solution will be 
here soon.  If man-db is already doing the leg work for the interim 
solution, then we have a much larger development team (Debian) to follow 
for guidance.  We'd be much better off without the wrapper for groff.
>>>> Is there some misunderstanding here about what man-db is doing? If so,
>>>> I'd be happy to explain.
>>>>         
>>>  Thanks for the offer, I might take you up on it in a few weeks.  NB
>>> my estimates for how long things will take me are always way out, so
>>> that might be next year!  Depends on how long I spend beating my
>>> head against the various versions of mozilla on ppc64, plus whatever
>>> goes wrong when I finally upgrade my desktop to current packages :-(
>>>       
>> I know how it is, don't worry. Building distributions is busy work (in
>> both senses) ...
>>
>> Cheers,
>>
>> -- 
>> Colin Watson 
>>     
>
>  thanks, Colin.
>
> ĸen
>   
My real concern is the version of groff being used.  I did not see any 
mention of a current groff version, which was *my* original concern.  I 
want to use what works, but I also want to stay as close to upstream as 
possible for all packages, because we (LFS) do not have the development 
staff that distributions have.  Keep in mind that LFS is an educational 
product, not a 'distribution', though many use it as their 
'distribution' of choice.  Utilizing Debian's work in this area was 
great (and will continue to be, I think).  It allowed Alexander to 
provide a working setup for almost all cases, and to explain in detail 
the future issues (though the current text, like much of the book ATM, 
is now out of date).

-- DJ Lucas
