linux-kernel - Re: Plain accesses and data races in the Linux Kernel Memory Model

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190117150336.GA10381@andrea>
Date:   Thu, 17 Jan 2019 16:03:36 +0100
From:   Andrea Parri <andrea.parri@...rulasolutions.com>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Nicholas Piggin <npiggin@...il.com>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: Plain accesses and data races in the Linux Kernel Memory Model

On Wed, Jan 16, 2019 at 10:36:58PM +0100, Andrea Parri wrote:
> [...]
> 
> > The difficulty with incorporating plain accesses in the memory model
> > is that the compiler has very few constraints on how it treats plain
> > accesses.  It can eliminate them, duplicate them, rearrange them,
> > merge them, split them up, and goodness knows what else.  To make some
> > sense of this, I have taken the view that a plain access can exist
> > (perhaps multiple times) within a certain bounded region of code.
> > Ordering of two accesses X and Y means that we guarantee at least one
> > instance of the X access must be executed before any instances of the
> > Y access.  (This is assuming that neither of the accesses is
> > completely eliminated by the compiler; otherwise there is nothing to
> > order!)
> > 
> > After adding some simple definitions for the sets of plain and marked
> > accesses and for compiler barriers, the patch updates the ppo
> > relation.  The basic idea here is that ppo can be broken down into
> > categories: memory barriers, overwrites, and dependencies (including
> > dep-rfi).
> > 
> > 	Memory barriers always provide ordering (compiler barriers do
> > 	not but they have indirect effects).
> > 
> > 	Overwriting always provides ordering.  This may seem
> > 	surprising in the case where both X and Y are plain writes,
> > 	but in that case the memory model will say that X can be
> > 	eliminated unless there is at least a compiler barrier between
> > 	X and Y, and this barrier will enforce the ordering.
> > 
> > 	Some dependencies provide ordering and some don't.  Going by
> > 	cases:
> > 
> > 		An address dependency to a read provides ordering when
> > 		the source is a marked read, even when the target is a
> > 		plain read.  This is necessary if rcu_dereference() is
> > 		to work correctly; it is tantamount to assuming that
> > 		the compiler never speculates address dependencies.
> > 		However, if the source is a plain read then there is
> > 		no ordering.  This is because of Alpha, which does not
> > 		respect address dependencies to reads (on Alpha,
> > 		marked reads include a memory barrier to enforce the
> > 		ordering but plain reads do not).
> 
> Can the compiler (maybe, it does?) transform, at the C or at the "asm"
> level, LB1's P0 in LB2's P0 (LB1 and LB2 are reported below)?
> 
> C LB1
> 
> {
> 	int *x = &a;
> }
> 
> P0(int **x, int *y)
> {
> 	int *r0;
> 
> 	r0 = rcu_dereference(*x);
> 	*r0 = 0;
> 	smp_wmb();
> 	WRITE_ONCE(*y, 1);
> }
> 
> P1(int **x, int *y, int *b)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*y);
> 	rcu_assign_pointer(*x, b);
> }
> 
> exists (0:r0=b /\ 1:r0=1)
> 
> 
> C LB2
> 
> {
> 	int *x = &a;
> }
> 
> P0(int **x, int *y)
> {
> 	int *r0;
> 
> 	r0 = rcu_dereference(*x);
> 	if (*r0)
> 		*r0 = 0;
> 	smp_wmb();
> 	WRITE_ONCE(*y, 1);
> }
> 
> P1(int **x, int *y, int *b)
> {
> 	int r0;
> 
> 	r0 = READ_ONCE(*y);
> 	rcu_assign_pointer(*x, b);
> }
> 
> exists (0:r0=b /\ 1:r0=1)
> 
> LB1 and LB2 are data-race free, according to the patch; LB1's "exists"
> clause is not satisfiable, while LB2's "exists" clause is satisfiable.
> 
> I'm adding Nick to Cc (I never spoke with him, but from what I see in
> LKML, he must understand compiler better than I currently do... ;-/ )
> 
>   Andrea
> 
> 
> > 
> > 		An address dependency to a write always provides
> > 		ordering.  Neither the compiler nor the CPU can
> > 		speculate the address of a write, because a wrong
> > 		guess could generate a data race.  (Question: do we
> > 		need to include the case where the source is a plain
> > 		read?)
> > 
> > 		A data or control dependency to a write provides
> > 		ordering if the target is a marked write.  This is
> > 		because the compiler is obliged to translate a marked
> > 		write as a single machine instruction; if it
> > 		speculates such a write there will be no opportunity
> > 		to correct a mistake.
> > 
> > 		Dep-rfi (i.e., a data or address dependency from a
> > 		read to a write which is then read from on the same
> > 		CPU) provides ordering between the two reads if the
> > 		target is a marked read.  This is again because the
> > 		marked read will be translated as a machine-level load
> > 		instruction, and then the CPU will guarantee the
> > 		ordering.
> > 
> > 		There is a special case (data;rfi) that doesn't
> > 		provide ordering in itself but can contribute to other
> > 		orderings.  A data;rfi link corresponds to situations
> > 		where a value is stored in a temporary shared variable
> > 		and then loaded back again.  Since the compiler might
> > 		choose to eliminate the temporary, its accesses can't
> > 		be said to be ordered -- but the accesses around it
> > 		might be.  As a simple example, consider:
> > 
> > 			r1 = READ_ONCE(ptr);
> > 			tmp = r1;
> > 			r2 = tmp;
> > 			WRITE_ONCE(*r2, 5);
> > 
> > 		The plain accesses involving tmp don't have any
> > 		particular ordering requirements, but we do know that
> > 		the READ_ONCE must be ordered before the WRITE_ONCE.
> > 		The chain of relations is:
> > 
> > 			[marked] ; data ; rfi ; addr ; [marked]
> > 
> > 		showing that a data;rfi has been inserted into an
> > 		address dependency from a marked read to a marked
> > 		write.  In general, any number of data;rfi links can
> > 		be inserted in each of the other kinds of dependencies.

As a more general comment (disclaimer), I'm not sure we want to/can add
all the constraints above.  On one hand, for some of them, I ignore the
existence of current use cases in the source (and I don't quite see my-
self encouraging their adoption...); on the other hand, these certainly
do not make the model "simpler" or easier to maintain (in a sound way).

Moreover, I doubt that runtime checkers a la KTSan will ever be able to
assist the developer by supporting these "dependency orderings". [1]

Maybe we could start by adding those orderings that we know are "widely"
relied upon _and_ used by the developers, and later add more/strengthen
the model as needed (where feasible).

Thoughts?

  Andrea

[1] https://groups.google.com/d/msg/ktsan/bVZ1c6H2NE0/gapvllYNBQAJ