RFC - bootscript error reporting

Jeremy Utley jeremy at linuxfromscratch.org
Thu Jan 29 00:25:28 PST 2004


On Wed, 2004-01-28 at 09:25, IvanK. wrote:
> On Wednesday 28 January 2004 02:23 am, Jeremy Utley wrote:
> > On Tue, 2004-01-27 at 20:58, IvanK. wrote:
> > > Jeremy,
> > >
> > > this is exactly what I have been doing, replacing the read with a
> > > echo "Waiting 5 seconds before continuing..."
> > > sleep 5
> >
> > That's the idea - pause so the user can see the error if they are at the
> > console...but still proceed if the machine is running unattended.
> >
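
A minimal sketch of the timed pause described above, for illustration only.
The function body and message text are assumptions, not the actual
print_error_msg from the bootscripts, and PAUSE_SECONDS is a hypothetical
knob that defaults to the 5 seconds mentioned in the thread.

```shell
# Hypothetical stand-in for print_error_msg: report the failure, then pause
# for a fixed time instead of blocking on "read" until a key is pressed.
print_error_msg() {
    echo "FAILURE: an error occurred while running a boot script."
    echo "Waiting ${PAUSE_SECONDS:-5} seconds before continuing..."
    # A user at the console gets time to read the message;
    # an unattended machine simply carries on afterwards.
    sleep "${PAUSE_SECONDS:-5}"
}
```

Because nothing waits on keyboard input, a headless box cannot hang at this
point during boot.
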
> > > But I would suggest a different approach since we're discussing.  If
> > > fixing, let's make it right.
> > >
> > > How about we actually pass the return value to print_error_msg inside rc,
> > > and maybe the script that rc was processing when the error occurred.  We
> > > can assign different levels of importance to each script.  Something like
> > > IMPORTANCE={0-2} where 0 is non-crucial, don't even report it, 1 is
> > > important, report but continue and 2 critical, stop.   Maybe also a 3
> > > that is fatal, shutdown and write a file to / something like /fatal so
> > > that if a reboot is attempted the first thing rc does is check if /fatal
> > > is present and refuse to continue.  I'm thinking a corrupted fs would
> > > warrant such an action.
> >
> > While a nice idea...I'm not sure I'd like the job of determining the
> > importance level.  I still think pause the process, and let the user
> > make the determination themselves on how serious the situation is.
> >
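
For reference, the IMPORTANCE proposal quoted above could look roughly like
the sketch below. Everything in it is hypothetical: handle_failure, its
argument order, and the messages are made up for illustration, and the
actual stop/shutdown actions are left as comments rather than implemented.

```shell
# Hypothetical dispatcher for the proposed levels:
#   0 = non-crucial, do not report      1 = report but continue
#   2 = critical, stop                  3 = fatal, leave a marker and shut down
# Usage: handle_failure <script-name> <return-value> <importance>
handle_failure() {
    script=$1
    retval=$2
    importance=$3
    case "$importance" in
        0) return 0 ;;
        1) echo "WARNING: $script exited with status $retval; continuing." ;;
        2) echo "CRITICAL: $script exited with status $retval; stopping."
           return 2 ;;
        3) echo "FATAL: $script exited with status $retval."
           # A real rc would create the marker file (such as /fatal)
           # and shut the system down here.
           return 3 ;;
        *) echo "Unknown importance '$importance' for $script" >&2
           return 1 ;;
    esac
}
```

The hard part is not this dispatch but deciding which level each script
deserves, which is the objection raised in the reply above.
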
> 
> from fsck's man page:
> 
> The exit code returned by fsck is the sum of the following conditions:
>             0    - No errors
>             1    - File system errors corrected
>             2    - System should be rebooted
>             4    - File system errors left uncorrected
>             8    - Operational error
>             16   - Usage or syntax error
>             32   - Fsck canceled by user request
>             128  - Shared library error
> The exit code returned when multiple file systems are checked is the
> bit-wise OR of the exit codes for each file system that is checked.
> 
> This means that we can easily identify the type of error fsck encountered
> and how it reacted, then decide how to proceed by passing that exit code
> to rc.
> 
> To me only two circumstances qualify as critical (or fatal, your choice of 
> words) enough to require a reboot or a halt: fs unrecoverable error or 
> network failure when we have a network-mounted fs.  Everything else can be 
> fixed either by dropping to bash or by continuing into your regular runlevel 
> and fixing from there.

OK, I'll be the first to admit that my bitwise-math isn't up to par, but
as I understand it, we could check for the presence of the "4" bit to
see if fsck found errors that could NOT be corrected.  Anything else
that we should check for that would warrant a "fatal" error?
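
To make the bit test concrete, here is a minimal sketch. fsck_has_bit is a
hypothetical helper name, not something in the current bootscripts; the bit
values come from the fsck man page excerpt quoted above.

```shell
# Hypothetical helper: succeeds if the given bit value is set in an
# fsck exit status.  Usage: fsck_has_bit <status> <bit-value>
fsck_has_bit() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# Example: 5 = 1 (errors corrected) + 4 (errors left uncorrected)
status=5
if fsck_has_bit "$status" 4; then
    echo "fsck left errors uncorrected (status $status)"
fi
```

Because the status is a sum of flags, comparing it for equality with 4
would miss a combined value such as 5; masking with & checks the one bit
regardless of what else is set.
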

And, as for how to handle a situation like that...is the best solution
to dump to a bash prompt with the filesystem in read-only mode, or is
the best solution to warn the user, halt the system, and let them boot
back up with an init=/bin/bash, or perhaps a rescue CD?

> 
> 
> > > So then, back to print_error_msg, we pass to it the name of the script
> > > and the return value and we decide based on the script's importance level
> > > what to do.
> > >
> > > Just some crazy ideas, I know but hey, you asked :-)
> > >
> > > So yes, this is a good thing (tm) that we're discussing bootscripts.
> > >
> > > Oh, and since we are discussing, where do we stand on  static vs dhcp?  I
> > > remember there was a long thread, but at some point I got distracted and
> > > didn't see the end of it.  If there's an interest we should address this
> > > as well and I can offer my approach.  I'm sure there'll be a better one.
> >
> > The core bootscripts package will not support DHCP out of the box...but
> > the plan is as Nathan said - make them more modular and extensible so
> > the BLFS guys can add in DHCP without completely rewriting the network
> > script.
> 
> Agreed.  I guess I didn't express myself eloquently enough -- my idea is to 
> include the hooks in the current network scripts to support a "pluggable" 
> static|dhcp model.
> 
> 
> And now the new fad -- rhgb (Red Hat graphical boot, at least I think that's
> what the abbreviation stands for) that's in Fedora Core.
> 
> I know, I know this has *nothing* to do with lfs and is strictly in the domain 
> of blfs, but if there's enough interest, can we discuss putting in place one 
> more function in functions and two lines in rc, please?

I'm not familiar with rhgb - is it something that scrolls the kernel
output up the screen in graphical mode, or is it something more like
{g,x,k}dm, allowing a graphical logon?

However, I can definitively say that *IF* there's enough interest in it,
just about anything is fair game in the bootscripts rewrite.  More on
this later!

-J-

> 
> Thanks,
> IvanK.
> >
> > Jeremy



