[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171102160237.t2xkryg6joskf77y@hirez.programming.kicks-ass.net>
Date: Thu, 2 Nov 2017 17:02:37 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: "Reshetova, Elena" <elena.reshetova@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"keescook@...omium.org" <keescook@...omium.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"ishkamiel@...il.com" <ishkamiel@...il.com>,
Will Deacon <will.deacon@....com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
parri.andrea@...il.com, boqun.feng@...il.com, dhowells@...hat.com,
david@...morbit.com
Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in
atomic_t
On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> On Thu, 2 Nov 2017, Peter Zijlstra wrote:
>
> > > Lock functions such as refcount_dec_and_lock() &
> > > refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > they atomic counterparts.
> >
> > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > refcount_dec_and_lock() merely orders all prior load/store's against all
> > later load/store's.
>
> In fact there is no guaranteed ordering when refcount_dec_and_lock()
> returns false;
It should provide a release:
- if !=1, dec_not_one will provide release
- if ==1, dec_not_one will no-op, but then we'll acquire the lock and
dec_and_test will provide the release, even if the test fails and we
unlock again it should still dec.
The one exception is when the counter is saturated, but in that case
we'll never free the object and the ordering is moot in any case.
> it provides ordering only if the return value is true.
> In which case it provides acquire ordering (thanks to the spin_lock),
> and both release ordering and a control dependency (thanks to the
> refcount_dec_and_test).
>
> > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > have you got anything simple around?
>
> The combination of acquire + release is not the same as smp_mb, because
acquire+release is nothing, its release+acquire that I meant which
should order things locally, but now that you've got me looking at it
again, we don't in fact do that.
So refcount_dec_and_lock() will provide a release, irrespective of the
return value (assuming we're not saturated). If it returns true, it also
does an acquire for the lock.
But combined they're acquire+release, which is unfortunate.. it means
the lock section and the refcount stuff overlaps, but I don't suppose
that's actually a problem. Need to consider more.
Powered by blists - more mailing lists