Message-id: <alpine.LNX.2.00.0911302027020.1345@bruno>
Date:	Mon, 30 Nov 2009 22:27:32 -0600 (CST)
From:	Joseph Parmelee <jparmele@...dbear.com>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Darren Hart <dvhltc@...ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Dinakar Guniguntala <dino@...ibm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: futex_cmpxchg_enabled not set in futex_init on pentium3




On Mon, 30 Nov 2009, Thomas Gleixner wrote:

> Can you please printk the return value of that cmpxchg() test and
> provide a full bootlog (dmesg) of your machine ?


Thanks for the responses, which verify that my understanding of the expected
behavior was correct.  That means that we have a real bug.

Attached is the complete output of dmesg from the most recent boot run.


This line is due to a printk added in futex_init:

[    0.147626] futex_init curval = F0006AA0


These lines are due to printk's added in
arch/x86/include/asm/futex.h:futex_atomic_cmpxchg_inatomic():

[    0.147384] cmpxchg: ax before=cf80e000, ax after=f0006aa0
[    0.147444] cmpxchg: bx before=0, bx after=0
[    0.147536] cmpxchg: cx before=0, cx after=0

The compiler generates cmpxchg %ecx,(%ebx), so I added extended asm to
dump the registers involved just before and after the cmpxchg into variables
for printk.

All is consistent with the fact that the fault is not occurring and the
cmpxchg is working "as expected" at address 0.  Examining /proc/kcore with
gdb shows that address c0000000 contains f0006aa0.  Direct access with gdb
to address 0 fails as expected.
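
To make the reasoning explicit, here is a minimal userspace sketch of why a
curval of F0006AA0 means the probe really read memory.  It assumes the
boot-time probe does a compare-and-exchange with oldval 0 and newval 0 and
enables futex_cmpxchg_enabled only when the result is -EFAULT.  The names
cmpxchg_u32 and probe_enables_futex_cmpxchg, and the "faults" flag standing
in for the page fault plus fixup path, are hypothetical and are not kernel
code.  __sync_val_compare_and_swap is the GCC/Clang builtin.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace stand-in for cmpxchg: returns the value found at *addr and
 * stores newval only if that value equals oldval.
 */
static uint32_t cmpxchg_u32(uint32_t *addr, uint32_t oldval, uint32_t newval)
{
    return __sync_val_compare_and_swap(addr, oldval, newval);
}

/*
 * Hypothetical model of the probe.  "faults" simulates whether the access
 * trapped; in that case the fixup path would hand back -EFAULT.  Otherwise
 * the caller simply sees whatever the word contained.
 * Returns 1 if the probe would enable futex cmpxchg support, else 0.
 */
static int probe_enables_futex_cmpxchg(uint32_t *word, int faults)
{
    uint32_t curval = faults ? (uint32_t)-EFAULT : cmpxchg_u32(word, 0, 0);

    return curval == (uint32_t)-EFAULT;
}
```

With a word holding f0006aa0 and no fault, the compare fails, the word is
left unchanged, and its contents come back as curval, so the probe does not
enable the feature.  That matches the printk output above.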

To completely eliminate any possibility that the fault was getting lost in
the fixup code somehow, I removed all the fixup code from the cmpxchg
extended asm, and the results are exactly the same.  In fact this run is
with the fixup code removed.  The fault is not occurring.


> That'd be a serious bug as it would let every NULL pointer dereference
> in the kernel proceed.

Interestingly, a printk inserted in futex_init that attempts a null
dereference results in an oops as expected.

>
> Could you also please do a quick check in which kernel version this
> got introduced ?

This was known to be working in 2.6.28.6.  Unfortunately, I didn't find it
until I updated glibc and ran its test suite on 2.6.31.5.  However, I have
been noticing some nasty log messages about page allocation failures in pppd
with plenty of available memory starting from sometime in the 2.6.31 series. 
One of them is also attached FWIW.  But these didn't seem to be causing any
problems other than making me nervous.

I am located in the mountains of Costa Rica with only a very slow dialup, so
git bisect is not an option for me.  But I do have old copies of vmlinuz
lying around that go back to the 2.6.29 series.  Unfortunately, I don't have
the matching unstripped vmlinux which would allow debugging, but I might be
able to test other ways.  I will post again as soon as I have something.

In the meantime, if you can think of any tests that you want to run, I will
be most happy to help.

Best regards,

Joseph

View attachment "dmesg" of type "TEXT/PLAIN" (16288 bytes)

View attachment "log" of type "TEXT/PLAIN" (4806 bytes)
