RFC - bootscript error reporting

James Robertson jwrober at linuxfromscratch.org
Wed Jan 28 15:11:09 PST 2004


I generally agree with Ivan's approach. See below as usual.

IvanK. wrote:
> Jeremy,
> 
> this is exactly what I have been doing, replacing the read with a
> echo "Waiting 5 seconds before continuing..."
> sleep 5
> 
> But I would suggest a different approach since we're discussing.  If fixing, 
> let's make it right.
> 
> How about we actually pass the return value to print_error_msg inside rc, and 
> maybe the script that rc was processing when the error occurred.  We can 
> assign different levels of importance to each script.  Something like 
> IMPORTANCE={0-2} where 0 is non-crucial, don't even report it, 1 is 
> important, report but continue and 2 critical, stop.   Maybe also a 3 that is 
> fatal, shutdown and write a file to / something like /fatal so that if a 
> reboot is attempted the first thing rc does is check if /fatal is present and 
> refuse to continue.  I'm thinking a corrupted fs would warrant such an 
> action.
> 
> So then, back to print_error_msg, we pass to it the name of the script and the 
> return value and we decide based on the script's importance level what to do.

I really like this.  As others have posted, not all errors are the same.
We would need to come up with some kind of system to trap errors from
the different programs that are called within the boot scripts.  Or, if
that is too hard, then at least some system to agree on a "criticality"
for certain Sxx scripts.  Kxx scripts are important, but IMO not as
critical as Sxx scripts when entering a given runlevel.
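
To make the criticality idea concrete, here is a rough sketch of how rc
could map a script name and return value to an action.  Everything in it
is an assumption for illustration: the function names importance_of and
decide_action do not exist in lfs-bootscripts, and the script-to-level
table is made up.  The levels follow Ivan's 0-3 proposal.

```shell
# Hypothetical sketch; none of these names exist in lfs-bootscripts.
# Levels (from Ivan's proposal):
#   0 = don't report, 1 = report and continue, 2 = stop, 3 = fatal

# Print the importance level for a script name ($1).
# The table below is invented for illustration only.
importance_of() {
    case "$1" in
        S*checkfs) echo 3 ;;   # corrupted fs: fatal
        S*mountfs) echo 2 ;;   # critical: stop
        S*network) echo 1 ;;   # report, keep going
        K*)        echo 1 ;;   # kill scripts: report, keep going
        *)         echo 1 ;;   # unknown scripts: report by default
    esac
}

# Print the action for script name ($1) and return value ($2).
# Output is one of: ok, ignore, report, stop, fatal
decide_action() {
    if [ "$2" -eq 0 ]; then
        echo ok
        return 0
    fi
    case "$(importance_of "$1")" in
        0) echo ignore ;;
        1) echo report ;;
        2) echo stop ;;
        3) echo fatal ;;
    esac
}
```

print_error_msg could then call something like
decide_action "$script" "$retval" and branch on the word it prints.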

Also, there needs to be a mechanism in place (in the bootscripts) to
notify the administrator, when he/she comes back to the console, that a
reboot occurred at such-and-such a date and time and what messages
resulted.  This is especially important for the errors that get a
timeout value and move on.  Can we use mail for that?  I don't know, I
am not _that_ smart with Linux yet.
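
Short of mail, one simple possibility is to append each failure to a log
file and show that file later.  This is only a sketch: the log path and
the function names log_boot_error and show_boot_errors are assumptions,
and it presumes the log location is writable at the point the error
happens, which may not hold early in the boot.

```shell
# Hypothetical sketch; the path and function names are assumptions.
BOOT_ERROR_LOG="${BOOT_ERROR_LOG:-/var/log/boot-errors.log}"

# Append one timestamped line: $1 = script name, $2 = return value.
log_boot_error() {
    printf '%s %s failed with status %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" >> "$BOOT_ERROR_LOG"
}

# Print the recorded errors, if there are any.
show_boot_errors() {
    if [ -s "$BOOT_ERROR_LOG" ]; then
        echo "Errors recorded during boot:"
        cat "$BOOT_ERROR_LOG"
    fi
}
```

show_boot_errors could be run from a late boot script or from a login
profile, so the administrator sees the list on returning to the console.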

On the simple pause idea, the time to pause needs to be easily 
changeable in a /etc/sysconfig file of some kind.  Everyone will have a 
different opinion as to how long they want the boot process to pause 
before continuing.
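
As a sketch of what that could look like: the file name
/etc/sysconfig/rc-error and the variable ERROR_PAUSE below are made up
for this example, as are the function names.  The 5 second default comes
from Ivan's workaround.

```shell
# Hypothetical sketch; file, variable and function names are assumptions.
ERROR_PAUSE_FILE="${ERROR_PAUSE_FILE:-/etc/sysconfig/rc-error}"

# Print the pause length in seconds.  Falls back to 5 when the config
# file is missing or ERROR_PAUSE is not a plain non-negative integer.
error_pause_seconds() {
    ERROR_PAUSE=""
    if [ -r "$ERROR_PAUSE_FILE" ]; then
        . "$ERROR_PAUSE_FILE"
    fi
    case "$ERROR_PAUSE" in
        ''|*[!0-9]*) echo 5 ;;
        *)           echo "$ERROR_PAUSE" ;;
    esac
}

# Replacement for the "Press Enter to continue" read.
pause_after_error() {
    secs=$(error_pause_seconds)
    echo "Waiting $secs seconds before continuing..."
    sleep "$secs"
}
```

The config file would then hold a single line such as ERROR_PAUSE=10,
and anyone who wants no pause at all could set it to 0.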

<opinion>
I am also not sure that LFS needs to be concerned with headless or 
non-administrator-local machines.  I would love it in my production 
environment, but most of our readers reboot at the console.  The issues 
Jeremy brought up are more about that.  The easiest thing is simply to 
fix the scripts to support the Enter key or change the text to say CTRL-J.

I am only throwing this out for thought.  I actually would love to see 
Jeremy's idea put into the scripts.
</opinion>

> 
> Just some crazy ideas, I know but hey, you asked :-)
> 
> So yes, this is a good thing (tm) that we're discussing bootscripts.
> 
> Oh, and since we are discussing, where do we stand on  static vs dhcp?  I 
> remember there was a long thread, but at some point I got distracted and 
> didn't see the end of it.  If there's an interest we should address this as 
> well and I can offer my approach.  I'm sure there'll be a better one.
> 
> IvanK.
> 
> On Tuesday 27 January 2004 06:17 pm, Jeremy Utley wrote:
>>As the new co-maintainer of the lfs-bootscripts package, I'd like to get
>>the community's input on what I feel is a fairly serious problem with
>>the LFS-bootscripts - that is the hanging of the bootscripts when an
>>error is encountered.  We've all seen it before:
>>
>>You should not be seeing this message! blah blah
>>
>>Press Enter to continue.
>>
>>This poses 2 problems - first, if the machine is unattended, this will
>>hang the reboot process.  Second, previous reports list that
>>occasionally hitting the enter key doesn't do the RightThing (TM) with
>>regard to this.  I propose to make the default bootscripts replace this
>>with a pause of a reasonable duration (5-10 secs), and then continue on
>>with the process.  The worst case if this is done that *I* personally
>>can think of is that some filesystems may not be properly unmounted on
>>the reboot, and might need to be fscked after the reboot.  It still,
>>however, gives the user who's watching the bootscripts proceed to see
>>the error that occurred, and look into the problem when the system comes
>>back up.
>>
>>Anyone see any major flaws with doing this?

James

-- 
James Robertson -- jwrober at linuxfromscratch dot org
Reg. Linux User -- #160424 -- http://counter.li.org
Reg. LFS User   -- #6981   -- http://www.linuxfromscratch.org
LFS Bugzilla Maintainer    -- http://{blfs-}bugs.linuxfromscratch.org


