Message-ID: <20100623091307.GA11072@tiehlicka.suse.cz>
Date:	Wed, 23 Jun 2010 11:13:07 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Darren Hart <dvhltc@...ibm.com>
Cc:	LKML <linux-kernel@...r.kernel.org>, Nick Piggin <npiggin@...e.de>,
	Alexey Kuznetsov <kuznet@....inr.ac.ru>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: futex: race in lock and unlock&exit for robust futex with PI?

Hi,

attached you can find a simple test case which fails quite easily on the
following glibc assert:
"SharedMutexTest: pthread_mutex_lock.c:289: __pthread_mutex_lock:
  Assertion `(-(e)) != 3 || !robust' failed."

AFAIU, this assertion says that the futex syscall cannot fail with ESRCH
(errno 3) for a robust futex because it should either succeed or fail
with EOWNERDEAD.
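
To make the condition concrete, here is a minimal sketch of the check as
it is quoted in the assertion message. The function name
lock_result_is_expected is made up for illustration; glibc itself uses
internal macros, and only the expression comes from the message above.

```c
#include <errno.h>

/* The literal 3 in the assertion message is ESRCH on Linux. */
_Static_assert(ESRCH == 3, "ESRCH is expected to be 3");

/* Hypothetical helper mirroring the quoted condition: `e` is the raw
 * (negative) syscall result, `robust` is non-zero for a robust mutex.
 * Returns 0 exactly when the assertion would fire. */
static int lock_result_is_expected(long e, int robust)
{
    return (-(e)) != ESRCH || !robust;
}
```

So the assertion only fires for the combination "robust mutex" plus
"syscall returned -ESRCH"; any other result passes the check.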

We have seen this problem on SLES11 and SLES11SP1 but I was able to
reproduce it with the 2.6.34 kernel as well.

The test case is quite easy. 

Executed with a parameter, it creates a test file and initializes a shared,
robust pthread mutex (optionally configured at compile time with priority
inheritance) backed by the mmapped test file. Without a parameter, it
mmaps the file and then, in a loop, locks and unlocks the mutex and checks
for EOWNERDEAD (this should never happen during the test as the process
never dies with the lock held).

If I run this application as multiple users in parallel, I can trigger the
above assertion. However, if priority inheritance is turned off, there is
no problem. I am also unable to reproduce it if the test case is run
under a single user.

I am using the attached runSimple.sh script to run the test case like
this:

rm test.file simple
for i in `seq 10` 
do 
	sh runSimple.sh
done

To disable PI, just comment out the USE_PI variable in the script.
You need to change the USER1 and USER2 variables to match your system. You
will need to run the script as root unless you have some special setup
that lets you run su on behalf of those users.

I have tried to look at futex_{un}lock_pi but it is really hard to
understand. I assume that lookup_pi_state is the one which returns ESRCH
when it is not able to find the pid of the current owner.
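
For reference, the owner is looked up from the value of the futex word
itself. Below is a small sketch of how that word is laid out, using the
constants exported in <linux/futex.h>. The struct and function names are
made up for illustration, and this is a userspace decoder, not the kernel
code.

```c
#include <linux/futex.h>
#include <stdint.h>

/* Hypothetical container for the fields packed into a PI/robust
 * futex word. */
struct futex_word_info {
    uint32_t owner_tid;  /* TID of the current owner, 0 if unlocked */
    int      waiters;    /* FUTEX_WAITERS bit: kernel has waiters queued */
    int      owner_died; /* FUTEX_OWNER_DIED bit: owner exited with lock */
};

static struct futex_word_info decode_futex_word(uint32_t val)
{
    struct futex_word_info info;
    info.owner_tid  = val & FUTEX_TID_MASK;
    info.waiters    = (val & FUTEX_WAITERS) != 0;
    info.owner_died = (val & FUTEX_OWNER_DIED) != 0;
    return info;
}
```

If the TID stored in the word no longer belongs to a live task at the
time of the lookup, there is no owner to attach the PI state to, which is
where I would expect ESRCH to come from.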

This would suggest that we are racing with the unlock of the current
lock holder, but I don't see how this is possible as both the lock and
unlock paths hold the fshared lock for all operations on the lock value.
I have noticed that the lock path drops fshared if the current holder is
dying, but then it retries the whole process again.

Any advice would be highly appreciated.

Let me know if you need any further information.

Thanks
-- 
Michal Hocko
L3 team 
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9    
Czech Republic

Download attachment "runSimple.sh" of type "application/x-sh" (951 bytes)

View attachment "simple.c" of type "text/x-csrc" (2657 bytes)
