Who understands this code?

Donald Smith dss-lfs at cfl.rr.com
Fri Mar 14 13:41:00 PST 2003


Greg Schafer wrote:
> On Fri, Mar 14, 2003 at 04:04:27PM +1100, Greg Schafer wrote:
> 
>>Don, both tests are being done inside a chroot. One fails, one doesn't.
>>One is an early chroot (kernel headers, glibc, binutils + Ch 5 stuff), one
>>is a full system. :-/
>>
>>For the record, my logs dating back to Aug 2002 with gcc-3.1.1 also confirm
>>the problem back then.
> 
> 
> Ok, this HAS to be some sort of glibc bug. Check this out:-
> 
> root:~# ldconfig
> root:~# ./a.out
> test 3 nonconsecutive pages - 40014000, 40149000
> root:~# rm /etc/ld.so.cache
> root:~# ./a.out
> root:~# ldconfig
> root:~# ./a.out
> test 3 nonconsecutive pages - 40014000, 40149000
> 
> So, rm'ing /etc/ld.so.cache fixes it! ?? Maybe the tiny ld.so.cache in early
> phase of chroot is triggering the bug and later on when we have more libs
> and the size of ld.so.cache grows the bug doesn't show up?
> 
> Weird, very weird.. I don't fancy explaining this one to the glibc guys. Any
> takers? :-)
> 
> Strace logs attached.
> 
> Greg

I don't think it's a bug at all, just a timing problem and a brain-dead 
test. What the test is trying to determine is whether it can unmap two 
contiguous mmap'ed regions with a single munmap call. Unfortunately, due 
to the timing of events, the ld.so.cache gets munmap'd right before the 
test program maps its first block. Because the ld.so.cache file is loaded 
early, it gets a low address, then some other things get mapped, then 
ld.so.cache gets unmap'd, then the two test maps happen, with the first 
grabbing the address just freed by ld.so.cache and the second landing way 
beyond the stuff mapped in between.

It works with no ld.so.cache because it's not there to get unmapped. It 
works with a large ld.so.cache (i.e. >4096 bytes) because that will free 
2 pages which the 2 mmap's can grab and they'll be contiguous. It does 
not work with a small ld.so.cache (in this case 3102 bytes) because only 
one page gets "held" by the ld.so.cache to be released just in time to 
screw up the test.

The only buggy code is the test. It should continue to mmap memory 
blocks until it gets two contiguous ones, then check the freeing.

> ------------------------------------------------------------------------
> 
> execve("./a.out", ["./a.out"], [/* 8 vars */]) = 0
> uname({sys="Linux", node="tigers-lfs", ...}) = 0
> brk(0)                                  = 0x8049a04
> open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY)      = -1 ENOENT (No such file or directory)

^^^^ NO ld.so.cache file so no mmap.

> open("/lib/i686/mmx/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory)
> stat64("/lib/i686/mmx", 0xbffff160)     = -1 ENOENT (No such file or directory)
> open("/lib/i686/libc.so.6", O_RDONLY)   = -1 ENOENT (No such file or directory)
> stat64("/lib/i686", 0xbffff160)         = -1 ENOENT (No such file or directory)
> open("/lib/mmx/libc.so.6", O_RDONLY)    = -1 ENOENT (No such file or directory)
> stat64("/lib/mmx", 0xbffff160)          = -1 ENOENT (No such file or directory)
> open("/lib/libc.so.6", O_RDONLY)        = 3

^^^^ fallback position - open libc.

> read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340[\1"..., 1024) = 1024
> fstat64(3, {st_mode=S_IFREG|0755, st_size=1263608, ...}) = 0
> mmap2(NULL, 1254820, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40014000

^^^^ then map it (it's 1.2 MB so uses lots of address space)

> mprotect(0x4013f000, 30116, PROT_NONE)  = 0
> mmap2(0x4013f000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x12a) = 0x4013f000
> mmap2(0x40144000, 9636, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40144000
> close(3)                                = 0

^^^^ mmap some more things

> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40147000

^^^^ test # 2 (??? didn't really check)

> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40148000
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40149000

^^^^ test # 3 - notice the consecutive block addresses, a block being 
0x1000 (4096) bytes.

> semget(IPC_PRIVATE, 0, 0)               = -1 ENOSYS (Function not implemented)
> _exit(0)                                = ?
> 
> 
> ------------------------------------------------------------------------
> 
> execve("./a.out", ["./a.out"], [/* 8 vars */]) = 0
> uname({sys="Linux", node="tigers-lfs", ...}) = 0
> brk(0)                                  = 0x8049a04
> open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY)      = 3

^^^^ Successful open of ld.so.cache

> fstat64(3, {st_mode=S_IFREG|0644, st_size=3102, ...}) = 0

^^^^ It's only 3102 bytes

> mmap2(NULL, 3102, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40014000

^^^^ So map it into memory

> close(3)                                = 0
> open("/lib/libc.so.6", O_RDONLY)        = 3
> read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340[\1"..., 1024) = 1024
> fstat64(3, {st_mode=S_IFREG|0755, st_size=1263608, ...}) = 0
> mmap2(NULL, 1254820, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40015000

^^^^ map in libc

> mprotect(0x40140000, 30116, PROT_NONE)  = 0
> mmap2(0x40140000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x12a) = 0x40140000
> mmap2(0x40145000, 9636, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40145000
> close(3)                                = 0

^^^^ Map in some other stuff

> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40148000

^^^^ Test # 2

> munmap(0x40014000, 3102)                = 0

^^^^ unmap ld.so.cache since the loader is now done mapping in the shared 
library.

> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40149000

^^^^ test # 3 - notice how the first map grabs the newly unmapped addresses

> write(2, "test 3 nonconsecutive pages - 40"..., 49) = 49
> semget(16, 0, 0)                        = -1 ENOSYS (Function not implemented)
> _exit(16)                               = ?


Instead of bombing on this check, it should just do another anonymous 
mmap to replace x (or whichever of x and y was mapped first) and recheck 
for contiguity. DO NOT UNMAP the first x; just let it go, because it'll be 
released on exit.

If you want to submit the change to the check, just let the gcc guys (or 
whoever) know that you encountered a race condition with a small (less 
than one memory page) ld.so.cache file that caused the check to fail 
incorrectly.

Don



