Message-ID: <20160224214013.GF3522@linux.vnet.ibm.com>
Date: Wed, 24 Feb 2016 13:40:13 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>, dipankar@...ibm.com,
Andrew Morton <akpm@...ux-foundation.org>,
josh@...htriplett.org, Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
rostedt <rostedt@...dmis.org>,
David Howells <dhowells@...hat.com>, edumazet@...gle.com,
dvhart@...ux.intel.com, fweisbec@...il.com,
Oleg Nesterov <oleg@...hat.com>,
bobby prani <bobby.prani@...il.com>
Subject: Re: [PATCH tip/core/rcu 02/14] documentation: Fix control dependency
and identical stores
On Wed, Feb 24, 2016 at 09:12:04PM +0000, Mathieu Desnoyers wrote:
> ----- On Feb 24, 2016, at 12:00 AM, Paul E. McKenney paulmck@...ux.vnet.ibm.com wrote:
>
> > The summary of the "CONTROL DEPENDENCIES" section incorrectly states that
> > barrier() may be used to prevent compiler reordering when more than one
> > leg of the control-dependent "if" statement start with identical stores.
> > This is incorrect at high optimization levels. This commit therefore
> > updates the summary to match the detailed description.
> >
> > Reported by: Jianyu Zhan <nasa4836@...il.com>
> > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > ---
> > Documentation/memory-barriers.txt | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/memory-barriers.txt
> > b/Documentation/memory-barriers.txt
> > index 904ee42d078e..e26058d3e253 100644
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -800,9 +800,13 @@ In summary:
> > use smp_rmb(), smp_wmb(), or, in the case of prior stores and
> > later loads, smp_mb().
> >
> > - (*) If both legs of the "if" statement begin with identical stores
> > - to the same variable, a barrier() statement is required at the
> > - beginning of each leg of the "if" statement.
> > + (*) If both legs of the "if" statement begin with identical stores to
> > + the same variable, then those stores must be ordered, either by
> > + preceding both of them with smp_mb() or by using smp_store_release()
> > + to carry out the stores. Please note that it is -not- sufficient
> > + to use barrier() at beginning of each leg of the "if" statement,
> > + as optimizing compilers do not necessarily respect barrier()
> > + in this case.
>
> Hrm, I really don't understand this one.
>
> One caveat, as stated here, would be that optimizing compilers
> can reorder instruction with respect to barrier() placed at the
> beginning of if/else legs that start with identical stores.
>
> It goes on stating that "smp_mb() or smp_store_release()" should
> be used rather than barrier() in those cases.
>
> I don't get how, from a compiler optimization perspective,
> barrier() is any different from smp_mb().
>
> #define barrier() __asm__ __volatile__("": : :"memory")
>
> vs
>
> #define mb() asm volatile("mfence":::"memory")
>
> What the compiler would observe is a "memory" clobber in both
> cases.
>
> Now if the stated cause of this issue would have been
> internal reordering of those identical stores within the
> processor, I would understand that smp_mb() has an
> effect which differs from the compiler barrier, but since
> the paragraph begins by stating that this is purely for
> compiler optimizations, I'm confused.
>
> What am I missing there ?
>
> Thanks,
>
> Mathieu
>
>
> >
> > (*) Control dependencies require at least one run-time conditional
> > between the prior load and the subsequent store, and this

Let's take the example, replace barrier() with smp_mb(), and see what
happens:

	q = READ_ONCE(a);
	if (q) {
		smp_mb();
		WRITE_ONCE(b, p);
		do_something();
	} else {
		smp_mb();
		WRITE_ONCE(b, p);
		do_something_else();
	}

Given the same compiler transformation:

	q = READ_ONCE(a);
	smp_mb();
	WRITE_ONCE(b, p);  /* Hoisted, but smp_mb() orders it after the load from a. */
	if (q) {
		/* WRITE_ONCE(b, p); -- moved up, still ordered by smp_mb() */
		do_something();
	} else {
		/* WRITE_ONCE(b, p); -- moved up, still ordered by smp_mb() */
		do_something_else();
	}

So ordering between the read from "a" and the write to "b" is still
preserved. The reason this works is that the smp_mb() does all the
ordering, so the fact that the control dependency has been eliminated
is irrelevant.
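
For anyone who wants to experiment outside the kernel, here is a rough
userspace sketch of the two alternatives named in the patch, written
against C11 <stdatomic.h>. It assumes that atomic_thread_fence() with
memory_order_seq_cst is a loose stand-in for smp_mb() and that a
memory_order_release store is a loose stand-in for smp_store_release().
The C11 and Linux-kernel memory models are not the same, so treat this
as an illustration only. The function names and the call counters are
hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical shared variables standing in for "a" and "b" above. */
static atomic_int a;
static atomic_int b;

/* Hypothetical counters so a caller can see which leg ran. */
static int something_calls;
static int something_else_calls;

static void do_something(void)      { something_calls++; }
static void do_something_else(void) { something_else_calls++; }

/*
 * Variant 1: a full fence in front of each identical store.
 * Assumption: the seq_cst fence approximates smp_mb(). Even if the
 * compiler hoists the fence and the store above the "if", the fence
 * still sits between the load from a and the store to b.
 */
static void store_after_full_fence(int p)
{
	int q = atomic_load_explicit(&a, memory_order_relaxed);

	if (q) {
		atomic_thread_fence(memory_order_seq_cst);
		atomic_store_explicit(&b, p, memory_order_relaxed);
		do_something();
	} else {
		atomic_thread_fence(memory_order_seq_cst);
		atomic_store_explicit(&b, p, memory_order_relaxed);
		do_something_else();
	}
}

/*
 * Variant 2: carry out the identical stores as release stores.
 * Assumption: the release store approximates smp_store_release().
 * The ordering is attached to the store itself, so it survives
 * hoisting without relying on the control dependency.
 */
static void store_with_release(int p)
{
	int q = atomic_load_explicit(&a, memory_order_relaxed);

	if (q) {
		atomic_store_explicit(&b, p, memory_order_release);
		do_something();
	} else {
		atomic_store_explicit(&b, p, memory_order_release);
		do_something_else();
	}
}
```

In both variants the ordering comes from the fence or the release store
rather than from the branch, which is why merging the two legs does not
matter. Calling either function from a single thread only checks the
data flow (b ends up equal to p and the expected leg runs); it cannot
demonstrate the ordering itself.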
Thanx, Paul