Message-ID: <538C86E2.1070806@hp.com>
Date: Mon, 02 Jun 2014 10:14:58 -0400
From: Waiman Long <waiman.long@...com>
To: Mikulas Patocka <mpatocka@...hat.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>, jejb@...isc-linux.org,
deller@....de, John David Anglin <dave.anglin@...l.net>,
linux-parisc@...r.kernel.org, linux-kernel@...r.kernel.org,
chegu_vinod@...com, paulmck@...ux.vnet.ibm.com, tglx@...utronix.de,
riel@...hat.com, akpm@...ux-foundation.org, davidlohr@...com,
hpa@...or.com, andi@...stfloor.org, aswin@...com,
scott.norton@...com, Jason Low <jason.low2@...com>
Subject: Re: [PATCH] fix a race condition in cancelable mcs spinlocks
On 06/01/2014 01:53 PM, Mikulas Patocka wrote:
> The code in kernel/locking/mcs_spinlock.c is broken.
The osq_lock and osq_unlock functions aren't the only ones that need to
be changed; mcs_spin_lock and mcs_spin_unlock have exactly the same
problem. There are certainly problems in other places as well.
> PA-RISC doesn't have xchg or cmpxchg atomic instructions like other
> processors. It only has ldcw and ldcd instructions that load a word (or
> doubleword) from memory and atomically store zero at the same location.
> These instructions can only be used to implement spinlocks, direct
> implementation of other atomic operations is impossible.
>
> Consequently, Linux xchg and cmpxchg functions are implemented in such a
> way that they hash the address, use the hash to index a spinlock, take the
> spinlock, perform the xchg or cmpxchg operation non-atomically and drop
> the spinlock.
>
> If you write to some variable with ACCESS_ONCE and use cmpxchg or xchg at
> the same time, you break it. ACCESS_ONCE doesn't take the hashed spinlock,
> so, in this case, cmpxchg or xchg isn't really atomic at all.
>
> This patch fixes the bug by introducing a new type atomic_pointer_t
> (backed by atomic_long_t) and replacing the offending pointer with it.
> atomic_long_set takes the hashed spinlock, so it avoids the race
> condition.
I believe that mixing cmpxchg/xchg with ACCESS_ONCE() is fairly common
in the kernel. It will be an additional burden on kernel developers
to make sure that this kind of breakage doesn't happen. We also need
clear documentation of this kind of architecture-specific behavior
somewhere, maybe in memory-barriers.txt.
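To make the failure mode described in the quoted text concrete, here is a
minimal user-space sketch of a hashed-lock cmpxchg emulation. It is not the
actual parisc implementation: NR_HASH_LOCKS, lock_for(), hashed_lock(),
hashed_unlock(), emulated_cmpxchg(), locked_store() and replay_race() are
hypothetical names chosen for illustration, and C11 atomics stand in for the
ldcw-based spinlock. replay_race() replays the bad interleaving by hand in a
single thread so that the outcome is deterministic.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NR_HASH_LOCKS 16	/* hypothetical table size */

/* Static atomics are zero-initialized, i.e. all locks start unlocked. */
static atomic_int hash_locks[NR_HASH_LOCKS];

/* Hash the address to pick one lock out of the table. */
static atomic_int *lock_for(const volatile void *addr)
{
	return &hash_locks[((uintptr_t)addr >> 4) % NR_HASH_LOCKS];
}

static void hashed_lock(const volatile void *addr)
{
	atomic_int *l = lock_for(addr);

	while (atomic_exchange_explicit(l, 1, memory_order_acquire))
		;	/* spin until the previous value was 0 */
}

static void hashed_unlock(const volatile void *addr)
{
	atomic_store_explicit(lock_for(addr), 0, memory_order_release);
}

/*
 * Emulated cmpxchg: the load, compare and store are ordinary memory
 * accesses. They are atomic only with respect to other code that takes
 * the same hashed lock.
 */
static long emulated_cmpxchg(volatile long *p, long old, long new_val)
{
	long cur;

	hashed_lock(p);
	cur = *p;
	if (cur == old)
		*p = new_val;
	hashed_unlock(p);
	return cur;
}

/* A store that cooperates with the emulation by taking the hashed lock. */
static void locked_store(volatile long *p, long v)
{
	hashed_lock(p);
	*p = v;
	hashed_unlock(p);
}

/*
 * Hand-replayed interleaving: a plain (ACCESS_ONCE-style) store from
 * "CPU B" lands between the load and the store of "CPU A"'s emulated
 * cmpxchg(&v, 0, 1). Returns the final value of v.
 */
static long replay_race(void)
{
	volatile long v = 0;
	long cur;

	hashed_lock(&v);	/* CPU A: cmpxchg begins */
	cur = v;		/* CPU A: reads 0, so the compare succeeds */
	v = 2;			/* CPU B: plain store, takes no lock */
	if (cur == 0)
		v = 1;		/* CPU A: overwrites CPU B's store */
	hashed_unlock(&v);
	return v;
}
```

With a truly atomic cmpxchg the final value would be 2 in either ordering
(the cmpxchg either fails after the store, or succeeds before it and is then
overwritten). In the replay it is 1, so CPU B's store is silently lost. A
writer that goes through locked_store() cannot land inside the critical
section, which is the property the quoted patch gets from atomic_long_set.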
> Index: linux-3.15-rc7/kernel/locking/mcs_spinlock.h
> ===================================================================
> --- linux-3.15-rc7.orig/kernel/locking/mcs_spinlock.h 2014-05-31 19:01:01.000000000 +0200
> +++ linux-3.15-rc7/kernel/locking/mcs_spinlock.h 2014-06-01 14:17:49.000000000 +0200
> @@ -13,6 +13,7 @@
> #define __LINUX_MCS_SPINLOCK_H
>
> #include <asm/mcs_spinlock.h>
> +#include <linux/atomic.h>
>
> struct mcs_spinlock {
> 	struct mcs_spinlock *next;
> @@ -119,7 +120,8 @@ void mcs_spin_unlock(struct mcs_spinlock
> */
>
> struct optimistic_spin_queue {
> -	struct optimistic_spin_queue *next, *prev;
> +	atomic_pointer_t next;
> +	struct optimistic_spin_queue *prev;
> 	int locked; /* 1 if lock acquired */
> };
Is there a way to do it without changing the pointer type? Changing the
type will make the code harder to read and understand.
-Longman