lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120807115647.GA12828@mudshark.cambridge.arm.com>
Date:	Tue, 7 Aug 2012 12:56:47 +0100
From:	Will Deacon <will.deacon@....com>
To:	linux-kernel@...r.kernel.org
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Chris Mason <chris.mason@...ionio.com>,
	Nicolas Pitre <nico@...xnic.net>,
	Arnd Bergmann <arnd@...db.de>,
	linux-arm-kernel@...ts.infradead.org
Subject: RFC: mutex: hung tasks on SMP platforms with asm-generic/mutex-xchg.h

Hello,

ARM recently moved to asm-generic/mutex-xchg.h for its mutex implementation
after our previous implementation was found to be missing some crucial
memory barriers. However, I'm seeing some problems running hackbench on
SMP platforms due to the way in which the MUTEX_SPIN_ON_OWNER code operates.

The symptoms are that a bunch of hackbench tasks are left waiting on an
unlocked mutex and therefore never get woken up to claim it. I think this
boils down to the following sequence:


        Task A        Task B        Task C        Lock value
0                                                     1
1       lock()                                        0
2                     lock()                          0
3                     spin(A)                         0
4       unlock()                                      1
5                                   lock()            0
6                     cmpxchg(1,0)                    0
7                     contended()                    -1
8       lock()                                        0
9       spin(C)                                       0
10                                  unlock()          1
11      cmpxchg(1,0)                                  0
12      unlock()                                      1


At this point, the lock is unlocked, but Task B is in an uninterruptible
sleep with nobody to wake it up.

The following patch fixes the problem by ensuring we put the lock into
the contended state if we acquire it from the spin loop on the slowpath
but I'd like to be sure that this won't cause problems with other mutex
implementations:


diff --git a/kernel/mutex.c b/kernel/mutex.c
index a307cc9..27b7887 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -170,7 +170,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
                if (owner && !mutex_spin_on_owner(lock, owner))
                        break;
 
-               if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
+               if (atomic_cmpxchg(&lock->count, 1, -1) == 1) {
                        lock_acquired(&lock->dep_map, ip);
                        mutex_set_owner(lock);
                        preempt_enable();


All comments welcome.

Cheers,

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ