RFC - bootscript error reporting

Bill's LFS Login lfsbill at nospam.dot
Wed Jan 28 05:27:35 PST 2004


On Tue, 27 Jan 2004, IvanK. wrote:

> Jeremy,
>
> this is exactly what I have been doing, replacing the read with a
> echo "Waiting 5 seconds before continuing..."
> sleep 5
>
>
> But I would suggest a different approach since we're discussing.  If fixing,
> let's make it right.
>
> How about we actually pass the return value to print_error_msg inside rc, and
> maybe the script that rc was processing when the error occurred.  We can
> assign different levels of importance to each script.  Something like
> IMPORTANCE={0-2} where 0 is non-crucial, don't even report it, 1 is
> important, report but continue and 2 critical, stop.   Maybe also a 3 that is
> fatal, shutdown and write a file to / something like /fatal so that if a
> reboot is attempted the first thing rc does is check if /fatal is present and
> refuse to continue.  I'm thinking a corrupted fs would warrant such an
> action.

Oops! You said "do it right". That may be all opinion, but I'll add my
two-bits ($0.25) here. My thoughts include shutdown and boot up.

First, *never* attempt to write anything to a *corrupted* file system.
I presume no need to discuss that?

Second, some errors related to fscking the FS are a little more complex
to distinguish (as to which FS had what error - threads exist on this)
and I don't know if there is a reasonable solution to that.
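
To illustrate why this is hard, here is a minimal sketch that decodes an
fsck exit status into the flag bits documented in fsck(8). When several
file systems are checked in one run, the status is the bitwise OR of the
individual results, so the decoded flags say what went wrong but not on
which FS. The function name describe_fsck_status is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: turn an fsck exit status into readable flags.
# Bit meanings follow the fsck(8) man page.
describe_fsck_status() {
    local status=$1 out=""
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if (( status & 1 ));   then out+="errors corrected; "; fi
    if (( status & 2 ));   then out+="reboot recommended; "; fi
    if (( status & 4 ));   then out+="errors left uncorrected; "; fi
    if (( status & 8 ));   then out+="operational error; "; fi
    if (( status & 16 ));  then out+="usage or syntax error; "; fi
    if (( status & 32 ));  then out+="cancelled by user; "; fi
    if (( status & 128 )); then out+="shared library error; "; fi
    # Drop the trailing separator before printing.
    echo "${out%; }"
}
```

Checking each file system separately and recording each status would
keep the per-FS detail, at the cost of losing the parallel checking
that a single "fsck -A" run can do.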

Third, to address concerns for both attended processes, where an admin
should be able to control continuation or not, and unattended, I suggest
the following.

When an error is detected, enter a function that sets a trap with a
time out (man bash, "TMOUT" and "SHELL BUILTIN COMMANDS", "trap" and
many refs to "trap" other places). Also, maybe change the prompt to hit
<CTL-J> instead of enter (or figure out what needs to be changed to get
<ENTER> to work correctly, likely an stty command is needed). When the
error routine is entered, the trap is set to timeout after a number of
seconds configured by the book's install instructions (e.g. they change
a value in the script or do something like the below):

  UsrVal=30  # set on its own line so the here-doc expansion can see it
  cat >> /etc/rc.d/init.d/test << EOF  # Notice no quoting
      DefTmOut=$UsrVal
      .
      .
      .
  EOF

When an error is seen, the prompt is displayed and, if no key is pressed
within the timeout period, the trap fires, signals are reset and the
process continues doing whatever it was doing.
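
A minimal sketch of the timed prompt, under stated assumptions: it uses
bash's "read -t" in place of the trap/TMOUT mechanism described above,
since that gives the same attended/unattended behavior in fewer lines.
The function name wait_for_admin and the 30-second fallback are
hypothetical; DefTmOut is the value written into the script above.

```shell
#!/bin/bash
# DefTmOut would be set by the install step; 30 is only a fallback here.
DefTmOut=${DefTmOut:-30}

# Hypothetical helper: prompt, then wait up to DefTmOut seconds.
# Returns 0 if someone answered (attended boot),
# 1 if the timeout expired or input was closed (unattended boot).
wait_for_admin() {
    local reply
    echo "Error detected. Press Enter within ${DefTmOut} seconds to stop here." >&2
    if read -r -t "$DefTmOut" reply; then
        return 0
    fi
    return 1
}
```

A caller could then write something like
"if wait_for_admin; then <hand control to the admin>; fi" and simply
fall through to continue the boot when nobody is at the console.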

>
> So then, back to print_error_msg, we pass to it the name of the script and the
> return value and we decide based on the script's importance level what to do.

Also a good idea.
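
A rough sketch of how that decision could look, assuming the 0-3
levels proposed above. The function names importance_of and
decide_action and the per-script level assignments are hypothetical,
chosen only for illustration. In line with the point about corrupted
file systems, the fatal case only names the action and writes nothing
to disk.

```shell
#!/bin/bash
# Hypothetical table: importance level (0-3) per bootscript.
# These assignments are illustrative, not taken from the thread.
importance_of() {
    case "$1" in
        sysklogd) echo 0 ;;   # non-crucial: don't even report
        network)  echo 1 ;;   # important: report but continue
        mountfs)  echo 2 ;;   # critical: stop
        checkfs)  echo 3 ;;   # fatal: shut down
        *)        echo 1 ;;   # unknown scripts: report but continue
    esac
}

# Print the action for a script name and its return value.
decide_action() {
    local script=$1 retval=$2
    if [ "$retval" -eq 0 ]; then
        echo ok
        return 0
    fi
    case "$(importance_of "$script")" in
        0) echo ignore ;;
        1) echo report ;;
        2) echo stop ;;
        3) echo fatal ;;
    esac
}
```

print_error_msg could then switch on the word that
"decide_action <script> <retval>" prints, rather than always prompting.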

><snip>

-- 
NOTE: I'm on a new ISP, if I'm in your address book ...
Bill Maltby
lfsbillATearthlinkDOTnet
Fix line above & use it to mail me direct.


