RFC - bootscript error reporting

IvanK. ivan at chepati.org
Wed Jan 28 09:25:37 PST 2004


On Wednesday 28 January 2004 02:23 am, Jeremy Utley wrote:
> On Tue, 2004-01-27 at 20:58, IvanK. wrote:
> > Jeremy,
> >
> > this is exactly what I have been doing, replacing the read with a
> > echo "Waiting 5 seconds before continuing..."
> > sleep 5
>
> That's the idea - pause so the user can see the error if they are at the
> console...but still proceed if the machine is running unattended.
>
> > But I would suggest a different approach since we're discussing.  If
> > fixing, let's make it right.
> >
> > How about we actually pass the return value to print_error_msg inside rc,
> > and maybe the script that rc was processing when the error occurred.  We
> > can assign different levels of importance to each script.  Something like
> > IMPORTANCE={0-2} where 0 is non-crucial, don't even report it, 1 is
> > important, report but continue and 2 critical, stop.   Maybe also a 3
> > that is fatal, shutdown and write a file to / something like /fatal so
> > that if a reboot is attempted the first thing rc does is check if /fatal
> > is present and refuse to continue.  I'm thinking a corrupted fs would
> > warrant such an action.
>
> While a nice idea...I'm not sure I'd like the job of determining the
> importance level.  I still think pause the process, and let the user
> make the determination themselves on how serious the situation is.
>

from fsck's man page:

The exit code returned by fsck is the sum of the following conditions:
            0    - No errors
            1    - File system errors corrected
            2    - System should be rebooted
            4    - File system errors left uncorrected
            8    - Operational error
            16   - Usage or syntax error
            32   - Fsck canceled by user request
            128  - Shared library error
The exit code returned when multiple file systems are  checked  is  the
bit-wise OR of the exit codes for each file system that is checked.

This means that we can easily identify the type of error fsck encountered and 
how it reacted, and then decide how to proceed by passing that exit code to 
rc.
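
A minimal sketch of that decoding in shell, using only the bit values from the
man page excerpt above.  The function names describe_fsck_status and
fsck_is_critical are hypothetical and do not exist in the current bootscripts,
and treating bit 4 as the only "critical" condition is an assumption made for
illustration.

```shell
# Hypothetical helper: print one line per condition set in an fsck exit status.
# The bit values and wording come from the fsck man page excerpt.
describe_fsck_status() {
    local status=$1 bit
    if [ "$status" -eq 0 ]; then
        echo "0: no errors"
        return 0
    fi
    for bit in 1 2 4 8 16 32 128; do
        if [ $(( status & bit )) -ne 0 ]; then
            case $bit in
                1)   echo "1: file system errors corrected" ;;
                2)   echo "2: system should be rebooted" ;;
                4)   echo "4: file system errors left uncorrected" ;;
                8)   echo "8: operational error" ;;
                16)  echo "16: usage or syntax error" ;;
                32)  echo "32: fsck canceled by user request" ;;
                128) echo "128: shared library error" ;;
            esac
        fi
    done
    return 0
}

# Hypothetical policy (an assumption): only uncorrected errors count as critical.
fsck_is_critical() {
    [ $(( $1 & 4 )) -ne 0 ]
}
```

For example, "describe_fsck_status 5" prints the lines for bits 1 and 4, and
"fsck_is_critical 5" succeeds because bit 4 is set.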

To me only two circumstances qualify as critical (or fatal, your choice of 
words) enough to require a reboot or a halt: fs unrecoverable error or 
network failure when we have a network-mounted fs.  Everything else can be 
fixed either by dropping to bash or by continuing into your regular runlevel 
and fixing from there.
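
The importance levels proposed in the quoted text could be handled roughly as
follows.  This is a sketch under stated assumptions, not the existing
print_error_msg: the names handle_boot_error, refuse_if_fatal and FATAL_FLAG
are hypothetical, the caller is assumed to supply the level, and the functions
only report and record the decision rather than halting anything themselves.

```shell
# Hypothetical flag file; /fatal is the path suggested in the proposal.
FATAL_FLAG=${FATAL_FLAG:-/fatal}

# Hypothetical: react to a failed bootscript according to its importance level.
#   $1 = script name, $2 = its exit status, $3 = importance level (0-3)
# Returns 0 if booting may continue, 2 for "critical, stop", 3 for "fatal".
handle_boot_error() {
    local script=$1 status=$2 level=$3
    if [ "$status" -eq 0 ]; then
        return 0                      # nothing went wrong
    fi
    case $level in
        0)  ;;                        # non-crucial: do not even report it
        1)  echo "WARNING: $script exited with status $status; continuing" ;;
        2)  echo "CRITICAL: $script exited with status $status; stopping"
            return 2 ;;
        3)  echo "FATAL: $script exited with status $status; writing $FATAL_FLAG"
            echo "$script $status" > "$FATAL_FLAG"
            return 3 ;;
        *)  echo "WARNING: $script exited with status $status (unknown level $level)" ;;
    esac
    return 0
}

# Hypothetical check for the very start of rc: refuse to continue if the
# flag file from an earlier fatal error is still present.
refuse_if_fatal() {
    if [ -e "$FATAL_FLAG" ]; then
        echo "Found $FATAL_FLAG from a previous boot; refusing to continue"
        return 1
    fi
    return 0
}
```

The caller (rc in this proposal) would act on the return value, for instance
by dropping to a shell on 2 and halting on 3.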


> > So then, back to print_error_msg, we pass to it the name of the script
> > and the return value and we decide based on the script's importance level
> > what to do.
> >
> > Just some crazy ideas, I know but hey, you asked :-)
> >
> > So yes, this is a good thing (tm) that we're discussing bootscripts.
> >
> > Oh, and since we are discussing, where do we stand on  static vs dhcp?  I
> > remember there was a long thread, but at some point I got distracted and
> > didn't see the end of it.  If there's an interest we should address this
> > as well and I can offer my approach.  I'm sure there'll be a better one.
>
> The core bootscripts package will not support DHCP out of the box...but
> the plan is as Nathan said - make them more modular and extensible so
> the BLFS guys can add in DHCP without completely rewriting the network
> script.

Agreed.  I guess I didn't express myself eloquently enough -- my idea is to 
include the hooks in the current network scripts to support a "pluggable" 
static|dhcp model.
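
One possible shape for such a hook, as a sketch only: the network script looks
up a handler by method name and calls it, so a dhcp handler can be added later
without touching the dispatcher.  All names here (iface_up, net_static_up,
net_dhcp_up) are hypothetical, and the handlers merely print what they would
do.

```shell
# Hypothetical handlers: a real one would configure the interface.
net_static_up() {
    echo "static: would assign a fixed address to $1"
}

net_dhcp_up() {
    echo "dhcp: would start a DHCP client on $1"
}

# Hypothetical dispatcher: $1 = interface, $2 = method (e.g. static, dhcp).
# Calls net_<method>_up if such a handler is defined, otherwise fails.
iface_up() {
    local iface=$1 method=$2 handler
    handler="net_${method}_up"
    if type "$handler" >/dev/null 2>&1; then
        "$handler" "$iface"
    else
        echo "no handler for method '$method'" >&2
        return 1
    fi
}
```

With this layout the core package would ship only net_static_up, and an add-on
could define net_dhcp_up in a separate file that the network script sources.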


And now the new fad -- rhgb (Red Hat graphical boot, at least I think that's 
what the abbreviation stands for) that's in Fedora Core.

I know, I know this has *nothing* to do with lfs and is strictly in the domain 
of blfs, but if there's enough interest, can we discuss putting in place one 
more function in functions and two lines in rc, please?

Thanks,
IvanK.
>
> Jeremy



