[Keystone Slip #38] What to do with kernel headers

Gerard Beekmans gerard at linuxfromscratch.org
Wed Mar 21 18:35:45 PST 2001


> History:
> 03/21/2001 21:21 by simon:
> Status changed to Closed
> I think the discussion on lfs-discuss on that matter has ended, and I've
> seen no convincing arguments that would make us change the way we do it.
> Supposedly, we should do so because Linus says so (he also says to use
> egcs, remember?), or because distros do so.
>
> We've always done it with symlinks, and it has always worked like that.
> Using LFS you do a lot of compiling, and I've never seen anything having
> trouble with kernel header version conflicts with glibc. It is a lot less
> trouble with symlinks, and if you're a programmer and you need static
> headers then just do it like you want, you probably know enough about
> headers to know what you want and how to do it.
>
> Unless problems appear, I suggest we continue using symlinks.
>
> "We should do it because others do it" has _absolutely_ no weight in my
> judgement.

I agree with that, but the following Glibc issue did strike me as being 
important:

Re: new kernel install. instructions for book
Date: Thu, 15 Mar 2001 02:04:01 +0100
From: "Matthias Benkmann" <haferfrost at web.de>
To: lfs-discuss at linuxfromscratch.org
Reply to: lfs-discuss at linuxfromscratch.org




On the contrary. If you symlink /usr/src/linux you have to recompile libc 
(and to be safe every program dynamically linked against libc) whenever 
you change a kernel. Only if you keep the old headers do you have a chance 
to survive a kernel update without problems. It's not guaranteed, though. 
A backwards incompatible change as it might occur between 2 major versions 
of a kernel may always require rebuilding libc (and applications that 
depend on the kernel structure that was changed).

Compiling a program with different headers than libc was compiled with is 
a bad thing because programs do not call the Linux kernel directly. They 
call libc, which wraps kernel functionality. 

(NOTE: I am not a kernel hacker so the following is just based on general 
programming knowledge.)

Take a look at /usr/src/linux/include/asm/stat.h

struct stat is the structure returned by the stat() function which returns 
file permissions/inode number/...

My stat.h looks like this:

struct __old_kernel_stat {
         unsigned short st_dev;
         unsigned short st_ino;  /*this is the inode number*/
         unsigned short st_mode;
         unsigned short st_nlink;
         unsigned short st_uid;
         unsigned short st_gid;
         unsigned short st_rdev;
         unsigned long  st_size;
         unsigned long  st_atime;
         unsigned long  st_mtime;
         unsigned long  st_ctime;
};

struct stat {
         unsigned short st_dev;
         unsigned short __pad1;
         unsigned long st_ino; /*this is the inode number*/
         unsigned short st_mode;
         unsigned short st_nlink;
         unsigned short st_uid;
         unsigned short st_gid;
         unsigned short st_rdev;
         unsigned short __pad2;
         unsigned long  st_size;
         unsigned long  st_blksize;
         unsigned long  st_blocks;
         unsigned long  st_atime;
         unsigned long  __unused1;
         unsigned long  st_mtime;
         unsigned long  __unused2;
         unsigned long  st_ctime;
         unsigned long  __unused3;
         unsigned long  __unused4;
         unsigned long  __unused5;
};

I don't know the particulars of struct __old_kernel_stat but let's assume 
for a moment that this was what struct stat looked like for some 0.x linux 
kernel and was changed when going to 1.x. What happens if you switch from 
a kernel that uses  __old_kernel_stat to a kernel that uses the current 
struct stat without recompiling libc? 

Assume you recompile ls after making that change. What happens? 

1st scenario: You kept the old header files that match the ones libc was 
compiled with (i.e. you follow Linus' advice).

ls calls libc's stat() function and libc calls the kernel as it has always 
done. Because the linux developers aren't stupid they did not just change 
the stat syscall to return the new stat structure. They kept the old 
syscall that returns the old stat and added a new syscall for the new 
stat. So libc calls the old syscall and gets the old stat structure which 
it happily returns to ls which interprets it correctly because it was 
compiled with the same header. Everything is fine.

2nd scenario: You replaced the header files so that ls is compiled against 
the new version while libc was compiled against the old version (i.e. you 
acted against Linus' advice)

ls calls libc's stat() function and libc calls the kernel. As mentioned 
above the syscall interface is usually kept backwards compatible, so the 
kernel returns an old struct stat to libc which passes that on to ls. 
However ls thinks it is getting a new struct stat and interprets it 
accordingly. Now look at the above structures. In the old struct stat the 
inode number is an "unsigned short", i.e. a 16-bit value. In the new struct 
stat it is an "unsigned long", i.e. a 32-bit value (this is for Intel, of 
course). The result is that ls will not only display wrong values; because 
the offsets are shifted and the size of the structure is different, all 
kinds of other bad stuff will happen so that ls will probably segfault. 
Fortunately ls doesn't *write* to disk. Now imagine what happens with a 
program that modifies a file based on the data returned by stat(). VERY 
VERY BAD!!
 
BTW, I have used the symlinks, too, until now. However I have just scared 
myself so that I will reconsider this, at least when changing to a new 
major version of the kernel. The symlinks should be fine if you're only 
upgrading a minor version of the kernel.

--------------

I would say that would be a good reason to go with static headers in 
/usr/include/linux and /usr/include/asm.

Don't get me wrong, I have never had any problems whatsoever with using 
symlinks for the headers. Well, aside from one little issue: I had to 
recompile net-tools, netkit-base and pppd when I switched from a 2.2 to a 2.4 
kernel because the network tools made Linux completely lock up. I don't 
know for sure whether that was related to changed headers in combination with 
Glibc, or to the binaries having been built against an older kernel and thus 
expecting old kernel behaviour. I didn't touch Glibc, so from the looks of it, 
it wasn't a Glibc problem.

I'm not enough of a programmer to decide whether Matthias' email has merit, 
but I am very inclined to believe him and act accordingly. Does anybody 
agree?

-- 
Gerard Beekmans
www.linuxfromscratch.org

-*- If Linux doesn't have the solution, you have the wrong problem -*-




