Date:	Wed, 25 Oct 2006 18:43:59 -0700
From:	john stultz <johnstul@...ibm.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Sripathi Kodi <sripathik@...ibm.com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: possible pthread_exit/exit glibc problem exposed by -rt kernels

Hey Ingo,
	Sripathi Kodi and Pat Gallop have been chasing this bug for a while,
and they've recently managed to make it reproducible w/ a simple test case
(credit to Pat for distilling the test case down!).

It seems some form of race in the pthread_exit() and exit() code in
glibc is getting exposed by the -rt kernels.

Running the attached test case w/ the following commands:
gcc -o term_test ./term_test.c -lpthread
ulimit -c unlimited
ulimit -t unlimited
export MALLOC_CHECK_=2
while true; do ./term_test > /dev/null ;done

will, after some time, produce a number of core files. Their backtraces
all look like:

Thread 1:
#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7e7e7d5 in raise () from /lib/tls/libc.so.6
#2  0xb7e80149 in abort () from /lib/tls/libc.so.6
#3  0xb7ebd665 in free_check () from /lib/tls/libc.so.6
#4  0xb7eb8e65 in free () from /lib/tls/libc.so.6
#5  0xb7fb5a5d in ___tls_get_addr_internal () from /lib/ld-linux.so.2
#6  0xb7f54c6b in __libc_dl_error_tsd () from /lib/tls/libc.so.6
#7  0xb7fb3045 in _dl_catch_error () from /lib/ld-linux.so.2
#8  0xb7f548be in __libc_dlsym () from /lib/tls/libc.so.6
#9  0xb7f8d2f0 in _Unwind_ForcedUnwind () from /lib/tls/libpthread.so.0
#10 0xb7f8af81 in __pthread_unwind () from /lib/tls/libpthread.so.0
#11 0xb7f86f00 in pthread_exit () from /lib/tls/libpthread.so.0
#12 0x0804865a in thread_worker (arg=0x0) at term_test.c:13
#13 0xb7f86341 in start_thread () from /lib/tls/libpthread.so.0
#14 0xb7f1e6fe in clone () from /lib/tls/libc.so.6

Thread 2:
#0  0xb7f0f890 in write () from /lib/tls/libc.so.6
#1  0xb7eb4d6f in _IO_new_file_write () from /lib/tls/libc.so.6
#2  0xb7eb37cb in _IO_new_do_write () from /lib/tls/libc.so.6
#3  0xb7eb4278 in _IO_new_file_overflow () from /lib/tls/libc.so.6
#4  0xb7eb676b in _IO_flush_all_lockp () from /lib/tls/libc.so.6
#5  0xb7eb6ae0 in _IO_cleanup () from /lib/tls/libc.so.6
#6  0xb7e814b2 in exit () from /lib/tls/libc.so.6
#7  0x0804870b in main () at term_test.c:46
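
The attachment itself is not reproduced in this archive. Purely as an illustration, the two backtraces are consistent with a program of roughly the following shape: worker threads that call pthread_exit() while main() calls exit() without joining them. This is a hypothetical sketch, not the actual term_test.c. The thread count, the printf() calls, the spawn_workers() helper and the TERM_SKETCH_MAIN guard are all assumptions; only the names thread_worker and main, and the pthread_exit()/exit() calls, come from the backtraces.

```c
/* Hypothetical sketch only; this is NOT the attached term_test.c. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 8	/* assumed; the real thread count is not stated */

/* Each worker writes some output and then leaves via pthread_exit(),
 * the call that starts the forced unwind seen in Thread 1. */
void *thread_worker(void *arg)
{
	(void)arg;
	printf("worker exiting\n");
	pthread_exit(NULL);
	return NULL;	/* not reached */
}

/* Start up to n workers without joining them (assumed helper).
 * Returns the number of threads actually created. */
int spawn_workers(int n)
{
	int started = 0;
	int i;

	for (i = 0; i < n; i++) {
		pthread_t tid;

		if (pthread_create(&tid, NULL, thread_worker, NULL) == 0)
			started++;
	}
	return started;
}

#ifdef TERM_SKETCH_MAIN	/* assumed guard so the helpers can be reused */
int main(void)
{
	spawn_workers(NUM_THREADS);
	/* exit() flushes stdio (_IO_cleanup, as in Thread 2) while the
	 * workers may still be inside pthread_exit(). */
	printf("main exiting\n");
	exit(0);
}
#endif
```

Built w/ something like "gcc -DTERM_SKETCH_MAIN -o term_sketch term_sketch.c -lpthread" and run in the same loop as above, this exercises the same pthread_exit()/exit() overlap. Whether it actually triggers the abort described here is untested.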

From Sripathi:
"The thread that crashes is trying to exit (pthread_exit) and in the
process, it accesses one of its thread local variables. When it tries
to free some memory from its dtv, free() detects an invalid pointer.
From my understanding, this memory is accessed only by the thread it
belongs to, hence this cannot be due to any races in the code."

I've been looking at the glibc code trying to see how a double free or
some other memory corruption could occur, but I've made no progress, so
I figured I'd throw it your way for suggestions on hunting this down.

I've reproduced it using a 4 processor system w/ -rt kernels from
2.6.16-rt22 through 2.6.18-rt7 on RHEL4 w/ glibc-2.3.4-2.13. 

I have not been able to reproduce it w/ any non-RT kernel, or w/
CONFIG_REALTIME_PREEMPT disabled.

Your thoughts?

thanks
-john

View attachment "term_test.c" of type "text/x-csrc" (909 bytes)
