linux-kernel - Re: [PATCH] documentation: Fix two-CPU control-dependency example

Message-Id: <20170724043600.GO3730@linux.vnet.ibm.com>
Date:   Sun, 23 Jul 2017 21:36:00 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Akira Yokosawa <akiyks@...il.com>
Cc:     Boqun Feng <boqun.feng@...il.com>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] documentation: Fix two-CPU control-dependency example

On Mon, Jul 24, 2017 at 09:04:57AM +0900, Akira Yokosawa wrote:
> On 2017/07/23 23:39:36 +0800, Boqun Feng wrote:
> > On Sat, Jul 22, 2017 at 09:43:00PM -0700, Paul E. McKenney wrote:
> > [...]
> >>> Your priority seemed to be reducing the chance of the "if" statement
> >>> being optimized away.  So I suggested using "extern" as a compromise.
> >>
> > 
> > Hi Akira,
> > 
> 
> Hi Boqun,
> 
> > The problem is that, such a compromise doesn't help *developers* write
> > good concurrent code. The document should serve as a reference book for
> > the developers, and with the compromise you suggest, the developers will
> > possibly add "extern" to their shared variables. This is not only
> > unrealistic but also wrong, because "extern" means external for
> > translation units (compilation units), not external for execution
> > units (CPUs).
> 
> Yes, I suggested it regarding the situation when the tiny litmus test
> is compiled in a translation unit. Also it might not be effective once
> link time optimization becomes "smart" enough.
> 
> And I agree it was not appropriate for memory-barriers.txt.
> 
> > 
> > And as I said, the proper semantics of READ_ONCE() should work well
> > without using "extern", if we find a 'volatile' load doesn't work, we
> > can find another way (writing in asm or use asm volatile("" : "+m"(var));
> > to indicate @var changed). And the compromise just changes the
> > semantics... To me, it's not worth changing the semantics because the
> > implementation might be broken in the future ;-)
> 
> I agree.
> 
> > 
> > 
> >> If the various tools accept the "extern", this might not be a bad thing
> >> to do.
> >>
> >> But what this really means is that I need to take another tilt at
> >> the "volatile" windmill in the committee.
> >>
> >>> Another way would be to express the ">=" version in a pseudo-asm form.
> >>>
> >>> 	CPU 0                     CPU 1
> >>> 	=======================   =======================
> >>> 	r1 = LOAD x               r2 = LOAD y
> >>> 	if (r1 >= 0)              if (r2 >= 0)
> >>> 	  STORE y = 1               STORE x = 1
> >>>
> >>> 	assert(!(r1 == 1 && r2 == 1));
> >>>
> >>> This should eliminate any concern of compiler optimization.
> >>> In this final part of CONTROL DEPENDENCIES section, separating the
> >>> problem of optimization and transitivity would clarify the point
> >>> (at least for me).
> >>
> >> The problem is that people really do use C-language control dependencies
> >> in the Linux kernel, so we need to describe them.  Maybe someday it
> >> will be necessary to convert them to asm, but I am hoping that we can
> >> avoid that.
> >>
> >>> Thoughts?
> >>
> >> My hope is that the memory model can help here, but that will in any
> >> case take time.
> > 
> > Hi Paul,
> > 
> > I added some comments to READ_ONCE() to emphasize that compilers should
> > honor the return value. In the future, we may need a separate document
> > for the use/definition of volatile in the kernel, but I think the comment
> > on READ_ONCE() is good enough for now?
> > 
> > Regards,
> > Boqun
> > 
> > ----------------->8
> > Subject: [PATCH] kernel: Emphasize the return value of READ_ONCE() is honored
> > 
> > READ_ONCE() is used throughout the kernel to provide control dependencies.
> > To make a control dependency valid, we must 1) make the load of
> > READ_ONCE() actually happen and 2) make sure compilers take the return
> > value of READ_ONCE() seriously. 1) is already done and commented,
> > and in the current implementation, 2) is also considered done in the
> > same way as 1): a 'volatile' load.
> > 
> > However, Akira Yokosawa recently reported a problem that would be
> > triggered if 2) is not achieved. 
> 
> To clarify the timeline, it was Paul who pointed out it would become
> easier for compilers to optimize away the "if" statements in response
> to my suggestion of partial revert (">" -> ">=").

Indeed I did.

And if nothing else, this discussion convinced me that I should push
harder on volatile.  It would be nice if we had more of a guarantee!

							Thanx, Paul

> >                                  Moreover, according to Paul McKenney,
> > using volatile might not actually give us what we want for 2) depending
> > on compiler writers' definition of 'volatile'. Therefore it's necessary
> > to emphasize 2) as a part of the semantics of READ_ONCE(). This not only
> > fits the conceptual semantics we have been using, but also makes the
> > implementation requirement more accurate.
> > 
> > In the future, we can either make compiler writers accept our use of
> > 'volatile', or (if that fails) find another way to provide this
> > guarantee.
> > 
> > Cc: Akira Yokosawa <akiyks@...il.com>
> > Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > Signed-off-by: Boqun Feng <boqun.feng@...il.com>
> > ---
> >  include/linux/compiler.h | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 219f82f3ec1a..8094f594427c 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -305,6 +305,31 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
> >   * mutilate accesses that either do not require ordering or that interact
> >   * with an explicit memory barrier or atomic instruction that provides the
> >   * required ordering.
> > + *
> > + * The return value of READ_ONCE() should be honored by compilers, IOW,
> > + * compilers must treat the return value of READ_ONCE() as an unknown value at
> > + * compile time, i.e. no optimization should be done based on the value of a
> > + * READ_ONCE(). For example, the following code snippet:
> > + *
> > + * 	int a = 0;
> > + * 	int x = 0;
> > + *
> > + * 	void some_func() {
> > + * 		int t = READ_ONCE(a);
> > + * 		if (!t)
> > + * 			WRITE_ONCE(x, 1);
> > + * 	}
> > + *
> > + * , should never be optimized as:
> > + *
> > + * 	void some_func() {
> > + * 		WRITE_ONCE(x, 1);
> > + * 	}
> READ_ONCE() should still be honored, so maybe the following?
> 
> + * , should never be optimized as:
> + *
> + *	void some_func() {
> + *		int t = READ_ONCE(a);
> + *		WRITE_ONCE(x, 1);
> + *	}
> 
>          Thanks, Akira
> 
> > + *
> > + * because the compiler is 'smart' enough to think the value of 'a' is never
> > + * changed.
> > + *
> > + * We provide this guarantee by making READ_ONCE() a *volatile* load.
> >   */
> >  
> >  #define __READ_ONCE(x, check)						\
> > 
>
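
To make the guarantee discussed in the patch concrete, here is a minimal
userspace sketch. It assumes GCC or Clang (for __typeof__) and uses simplified
stand-ins for READ_ONCE() and WRITE_ONCE() built from plain volatile casts;
these are not the kernel's definitions from include/linux/compiler.h. The
helpers x_after_call() and self_check() are hypothetical names introduced only
for this illustration. A single-threaded check like this cannot demonstrate
ordering between CPUs; it only shows that the store to x has to stay
conditional on the value loaded from a.

```c
#include <assert.h>

/*
 * Simplified userspace stand-ins (an assumption for this sketch, NOT the
 * kernel's macros): each access goes through a volatile-qualified lvalue.
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static int a;
static int x;

/* The function from the proposed comment: the store depends on the load. */
static void some_func(void)
{
	int t = READ_ONCE(a);

	if (!t)
		WRITE_ONCE(x, 1);
}

/* Hypothetical helper: set 'a', clear 'x', run some_func(), report 'x'. */
static int x_after_call(int a_val)
{
	WRITE_ONCE(a, a_val);
	WRITE_ONCE(x, 0);
	some_func();
	return READ_ONCE(x);
}

/* Hypothetical helper: exercise both outcomes of the "if" statement. */
static void self_check(void)
{
	assert(x_after_call(0) == 1);	/* a == 0: the store must happen */
	assert(x_after_call(1) == 0);	/* a != 0: the store must not happen */
}
```

Calling self_check() from a main() built with optimization enabled (for
example -O2) is expected to pass as long as the compiler treats the volatile
load as a value unknown at compile time, which is the property the proposed
comment asks compilers to honor. If the "if" were optimized away as in either
rejected form above, the second assertion would fail.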