Message-ID: <Pine.LNX.4.44L0.1711021245570.1277-100000@iolanthe.rowland.org>
Date: Thu, 2 Nov 2017 13:08:52 -0400 (EDT)
From: Alan Stern <stern@...land.harvard.edu>
To: Peter Zijlstra <peterz@...radead.org>
cc: "Reshetova, Elena" <elena.reshetova@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"keescook@...omium.org" <keescook@...omium.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"ishkamiel@...il.com" <ishkamiel@...il.com>,
Will Deacon <will.deacon@....com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
<parri.andrea@...il.com>, <boqun.feng@...il.com>,
<dhowells@...hat.com>, <david@...morbit.com>
Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in
atomic_t
On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> >
> > > > Lock functions such as refcount_dec_and_lock() &
> > > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > > they atomic counterparts.
> > >
> > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > refcount_dec_and_lock() merely orders all prior load/store's against all
> > > later load/store's.
> >
> > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > returns false;
>
> It should provide a release:
>
> - if !=1, dec_not_one will provide release
> - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
> dec_and_test will provide the release, even if the test fails and we
> unlock again it should still dec.
>
> The one exception is when the counter is saturated, but in that case
> we'll never free the object and the ordering is moot in any case.

Also if the counter is 0, but that will never happen if the
refcounting is correct.

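For reference, the control flow being discussed can be sketched in user space roughly as follows. This is only an illustration, not the kernel code: struct ref and the ref_*() helpers are made-up stand-ins built on C11 atomics and a pthread mutex, and neither saturation nor a zero counter is handled.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t; saturation and 0 are not modeled. */
struct ref {
	atomic_uint count;
};

/* Decrement unless the count is 1; release ordering when it decrements. */
static bool ref_dec_not_one(struct ref *r)
{
	unsigned int old = atomic_load_explicit(&r->count, memory_order_relaxed);

	while (old > 1) {
		/* On failure, old is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak_explicit(&r->count, &old, old - 1,
							   memory_order_release,
							   memory_order_relaxed))
			return true;
	}
	return false;
}

/* Decrement with release ordering; true if the count dropped to zero. */
static bool ref_dec_and_test(struct ref *r)
{
	return atomic_fetch_sub_explicit(&r->count, 1, memory_order_release) == 1;
}

/* Returns true, with the lock held, only if the last reference was dropped. */
static bool ref_dec_and_lock(struct ref *r, pthread_mutex_t *lock)
{
	if (ref_dec_not_one(r))
		return false;		/* count was != 1: release only */

	pthread_mutex_lock(lock);	/* count looked like 1: acquire */
	if (!ref_dec_and_test(r)) {	/* release, even if the test fails */
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}
```

In this sketch every path performs exactly one decrement with release ordering, and the true path additionally takes the lock first, which is the acquire-then-release combination discussed below.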
> > it provides ordering only if the return value is true.
> > In which case it provides acquire ordering (thanks to the spin_lock),
> > and both release ordering and a control dependency (thanks to the
> > refcount_dec_and_test).
> >
> > > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > > have you got anything simple around?
> >
> > The combination of acquire + release is not the same as smp_mb, because
>
> acquire+release is nothing, its release+acquire that I meant which
> should order things locally, but now that you've got me looking at it
> again, we don't in fact do that.
>
> So refcount_dec_and_lock() will provide a release, irrespective of the
> return value (assuming we're not saturated). If it returns true, it also
> does an acquire for the lock.
>
> But combined they're acquire+release, which is unfortunate.. it means
> the lock section and the refcount stuff overlaps, but I don't suppose
> that's actually a problem. Need to consider more.
Right. To address your point: release + acquire isn't the same as a
full barrier either. The SB pattern illustrates the difference:
        P0              P1
        Write x=1       Write y=1
        Release a       smp_mb
        Acquire b       Read x=0
        Read y=0
This outcome (both reads returning 0) would not be allowed if the
release + acquire sequence in P0 were replaced by smp_mb. But as it
stands, it is allowed because nothing prevents the CPU from
interchanging the order of the release and the acquire -- and then
you're back to the acquire + release case.
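Written out as a litmus test, the SB pattern above might look roughly like this. It is a sketch only: the test name, the register names, and the choice of smp_store_release()/smp_load_acquire() on two separate variables a and b are assumptions made for illustration.

```litmus
C SB+rel-acq-different-locations+mb

{}

P0(int *x, int *y, int *a, int *b)
{
	int r0;
	int r1;

	WRITE_ONCE(*x, 1);
	smp_store_release(a, 1);	/* Release a */
	r0 = smp_load_acquire(b);	/* Acquire b: a different location */
	r1 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r2;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r2 = READ_ONCE(*x);
}

exists (0:r1=0 /\ 1:r2=0)
```

A tool such as herd7 can be used to check whether the "exists" clause is reachable under a given memory model.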
However, there is one circumstance where this interchange isn't
allowed: when the release and acquire access the same memory
location. Thus:
P0(int *x, int *y, int *a)
{
	int r0;

	WRITE_ONCE(*x, 1);
	smp_store_release(a, 1);
	smp_load_acquire(a);
	r0 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
This is forbidden. It would remain forbidden even if the smp_mb in P1
were replaced by a similar release/acquire pair for the same memory
location.
To see the difference between smp_mb and release/acquire requires three
threads:
        P0              P1              P2
        Write x=1       Read y=1        Read z=1
        Release a       data dep.       smp_rmb
        Acquire a       Write z=1       Read x=0
        Write y=1
The Linux Kernel Memory Model allows this execution, although as far as
I know, no existing hardware will do it. But with smp_mb in P0, the
execution would be forbidden.
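As a litmus test, the three-thread pattern might be written along these lines. Again this is a sketch: the test name and register names are invented here, and the data dependency in P1 is modeled by storing the value just read from y.

```litmus
C WRC+rel-acq-same-location+data+rmb

{}

P0(int *x, int *y, int *a)
{
	int r0;

	WRITE_ONCE(*x, 1);
	smp_store_release(a, 1);	/* Release a */
	r0 = smp_load_acquire(a);	/* Acquire a: same location */
	WRITE_ONCE(*y, 1);
}

P1(int *y, int *z)
{
	int r1;

	r1 = READ_ONCE(*y);
	WRITE_ONCE(*z, r1);		/* data dependency on the read of y */
}

P2(int *x, int *z)
{
	int r2;
	int r3;

	r2 = READ_ONCE(*z);
	smp_rmb();
	r3 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 2:r2=1 /\ 2:r3=0)
```

Replacing the release/acquire pair in P0 with a single smp_mb() gives the variant that is expected to be forbidden.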
None of this should be a problem for refcount_dec_and_lock, assuming it
is used purely for reference counting.
Alan Stern