RFC - bootscript error reporting

Bill's LFS Login lfsbill at nospam.dot
Fri Jan 30 08:07:12 PST 2004


On Fri, 30 Jan 2004, Ian Molton wrote:

> On Thu, 29 Jan 2004 20:46:43 -0500 (EST)
> Bill's LFS Login <lfsbill at nospam.dot> wrote:
>
> > a partially re-written NIC driver I maintain for
> > myself
>
> why not submit the code back?

Warning, long-winded more-than-you-ever-wanted-to-know answer typical
of me.

Combinations of my need, personal shame, time and uncertainty.

A little background: was given an IBM Etherjet with both RJ-45 and BNC
connections and (looked to me like) support for both. The chip is a
CS8920 (or 40, I'm not sure ATM) and has the common config storage,
needs DOS setup diskette, the usual sort of stuff. My LAN was coax,
leftover from parts accumulated way back when Artisoft Lantastic was
cheap and available. Ethernet was expensive.

So my first need was for it to work with coax. I had just done my first
LFS, 3.3 or 3.2, and kernel was 2.4.17. The driver would not work in my
env. It refused to recognize the coax connection and kept going for
RJ-45 *regardless* of parameters passed in modules.conf.
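
To make the expected behaviour concrete, here is a minimal sketch of
media selection in which an explicit user setting always overrides
autodetection. It is not taken from the real driver: the enum, the
function name `choose_media` and the fall-back-to-BNC rule are all
assumptions made purely for illustration.

```c
/* Hypothetical media types; names are illustrative, not the driver's. */
enum media { MEDIA_AUTO, MEDIA_TP, MEDIA_BNC };

/*
 * requested:  what the user asked for (e.g. through a module option)
 * tp_link_ok: non-zero if a twisted-pair link was detected
 *
 * An explicit request is honoured unconditionally. Autodetection is
 * only consulted when the user left the choice open.
 */
static enum media choose_media(enum media requested, int tp_link_ok)
{
    if (requested != MEDIA_AUTO)
        return requested;              /* explicit choice wins */

    /* Assumed fallback: no twisted-pair link means try coax. */
    return tp_link_ok ? MEDIA_TP : MEDIA_BNC;
}
```

With this ordering, asking for coax yields coax even when the RJ-45
port happens to show a link, which is the case that failed for me.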

So I jumped into the code. I had not written a device driver since about
1986 or so and that was for true UNIX SYS-V with KSDK and all the
goodies available. Had never even looked inside Linux stuff although I
had used Linux for other things for a couple of years.

Here comes the shame part. I jumped into the driver and did the first
shameful thing: I cursed the developers and maintainers of that code
for delivering some of the least well-structured, least well-thought-out
and least robust code I had ever seen. I completely forgot about the
manner in which these things evolve.

Anyway, downloaded the pdfs for the chipset, did a little studying,
located the problem areas, noted all the convoluted logic leading up to
the problem area and developed my strategy. It consisted of re-writing
everything leading up to and including the offending areas to have
*proper* operation (meaning it would use the facilities on-board
properly), have more meaningful data names, substantially more
commentary to aid those that might follow (which appears to be a Bad
Thing (TM) in current environments AFAICT). I had intentions to
contribute. Here comes shame number 2.

I was goal driven to "get it working" so I could move on and did not
take the time, as is my usual procedure, to understand and learn proper
usage of the system calls for communicating with the kernel and doing
modular driver stuff. I just saw what was used, that it worked and
applied it as seemed appropriate. I am ashamed of that.

Consequently, even though I feel it is much better code than what I
started with, I am not comfortable with passing it along. It may be
nothing more than a different version of junk than what was already
there (although I really don't think so). And that uncertainty is a
strong brake for me. If I am to contribute, I expect to deliver stuff I
am certain is better or more useful in some way.

Finally, time. Once I got it working I wanted to focus on LFS and other
things in anticipation of being prepared for the next employment
opportunity (which has not materialized). So I did not invest the time
and effort to finish re-writing the rest of the module, learn the
details of the kernel calls and facilities, etc.

So as I have progressed through upgrades, I diff sources and add in more
stupidity from the later contributors, to ensure that if I ever do
contribute, my code will handle whatever fix/functionality the
contributor thought they were adding. Usually that is just hard-coding
some value, either because they don't have a good understanding of the
chip operations/capabilities or because their particular card lacks some
capability (e.g. there are different wiring options that affect what a
certain signal line indicates).

Now I have both coax and twisted pair networks and can test my card
for both situations if I ever do finish the re-write. Have hopes of
someday completing the re-write and learning the proper use of the
kernel facilities and then would love to submit it and see what happens.

If my background and personal assessment are any good, I'm quite good at
this stuff, a little too anal about quality (for today's environment)
and I could even feel comfortable at offering to be the maintainer for
it.

But none of that will happen until I feel I have less to contribute to
LFS in areas that others shun. Or I get a job. Since I do enjoy the
design, dev, implement of this sort of stuff, I do hope to be able to
get back to the fun stuff one day.

BTW, 8139too is next on my hit list - getting tired of the "too many
interrupts..." that seem to be caused (on *very* shallow examination) by
code similar to this

  while(a >= 0) {
    /* do some stuff */
    a--;
  }
  /* a is always < 0 here, so the message fires every time */
  if(a < 0) irritate_the_shit_out_of_bill_with_incorrect_messages();

My sneaky hunch is that it is an uninitialized var somewhere because the
problem is *usually* fixed on the first ifconfig down, rmmod 8139too,
ifconfig... eth0. And everything works well for days or weeks. Until I
reboot again and go through the process again.
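
For contrast, a minimal sketch of a bounded work loop that only
complains when the budget really ran out with work still pending. This
is not the 8139too code: `struct fake_dev`, `service_one` and
`handle_events` are hypothetical names, and the "device" is just a
counter. The counter is initialized on every call, which is the thing
my hunch says is missing somewhere.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device: counts events still waiting. */
struct fake_dev {
    int pending;    /* events not yet serviced */
};

/* Service one event; returns true if there was one to handle. */
static bool service_one(struct fake_dev *dev)
{
    if (dev->pending <= 0)
        return false;
    dev->pending--;
    return true;
}

/*
 * Handle at most `budget` events.
 * Returns true only if the budget was used up while events were still
 * pending, the one case that deserves a "too much work" warning.
 */
static bool handle_events(struct fake_dev *dev, int budget)
{
    int work_left = budget;    /* set on every call, never left stale */

    while (work_left > 0 && service_one(dev))
        work_left--;

    return work_left == 0 && dev->pending > 0;
}
```

The caller would print its warning only when `handle_events` returns
true, so finishing all the work exactly on budget stays silent.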

-- 
NOTE: I'm on a new ISP, if I'm in your address book ...
Bill Maltby
lfsbillATearthlinkDOTnet
Fix line above & use it to mail me direct.


