[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 10 Oct 2019 14:13:47 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Waiman Long <longman@...hat.com>,
Davidlohr Bueso <dave@...olabs.net>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
1vier1@....de, "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: wake_q memory ordering
Hi Peter,
On 10/10/19 1:42 PM, Peter Zijlstra wrote:
> On Thu, Oct 10, 2019 at 12:41:11PM +0200, Manfred Spraul wrote:
>> Hi,
>>
>> Waiman Long noticed that the memory barriers in sem_lock() are not really
>> documented, and while adding documentation, I ended up with one case where
>> I'm not certain about the wake_q code:
>>
>> Questions:
>> - Does smp_mb__before_atomic() + a (failed) cmpxchg_relaxed provide an
>> ordering guarantee?
> Yep. Either the atomic instruction implies ordering (eg. x86 LOCK
> prefix) or it doesn't (most RISC LL/SC), if it does,
> smp_mb__{before,after}_atomic() are a NO-OP and the ordering is
> unconditinoal, if it does not, then smp_mb__{before,after}_atomic() are
> unconditional barriers.
And _relaxed() differs from "normal" cmpxchg only for LL/SC
architectures, correct?
Therefore smp_mb__{before,after}_atomic() may be combined with
cmpxchg_relaxed, to form a full memory barrier, on all archs.
[...]
>> - Is it ok that wake_up_q just writes wake_q->next, shouldn't
>> smp_store_acquire() be used? I.e.: guarantee that wake_up_process()
>> happens after cmpxchg_relaxed(), assuming that a failed cmpxchg_relaxed
>> provides any ordering.
> There is no such thing as store_acquire, it is either load_acquire or
> store_release. But just like how we can write load-aquire like
> load+smp_mb(), so too I suppose we could write store-acquire like
> store+smp_mb(), and that is exactly what is there (through the implied
> barrier of wake_up_process()).
Thanks for confirming my assumption:
The code is correct, due to the implied barrier inside wake_up_process().
[...]
>> rewritten:
>>
>> start condition: A = 1; B = 0;
>>
>> CPU1:
>> B = 1;
>> RELEASE, unlock LockX;
>>
>> CPU2:
>> lock LockX, ACQUIRE
>> if (LOAD A == 1) return; /* using cmp_xchg_relaxed */
>>
>> CPU2:
>> A = 0;
>> ACQUIRE, lock LockY
>> smp_mb__after_spinlock();
>> READ B
>>
>> Question: is A = 1, B = 0 possible?
> Your example is incomplete (there is no A=1 assignment for example), but
> I'm thinking I can guess where that should go given the earlier text.
A=1 is listed as start condition. Way before, someone did wake_q_add().
> I don't think this is broken.
Thanks.
--
Manfred
Powered by blists - more mailing lists