linux-kernel - Re: [PATCH] Linux: Implement membarrier function

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Fri, 14 Dec 2018 16:39:34 -0500 (EST)
From:   Alan Stern <stern@...land.harvard.edu>
To:     "Paul E. McKenney" <paulmck@...ux.ibm.com>
cc:     David Goldblatt <davidtgoldblatt@...il.com>,
        <mathieu.desnoyers@...icios.com>,
        Florian Weimer <fweimer@...hat.com>, <triegel@...hat.com>,
        <libc-alpha@...rceware.org>, <andrea.parri@...rulasolutions.com>,
        <will.deacon@....com>, <peterz@...radead.org>,
        <boqun.feng@...il.com>, <npiggin@...il.com>, <dhowells@...hat.com>,
        <j.alglave@....ac.uk>, <luc.maranget@...ia.fr>, <akiyks@...il.com>,
        <dlustig@...dia.com>, <linux-arch@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Linux: Implement membarrier function

On Fri, 14 Dec 2018, Paul E. McKenney wrote:

> I would say that sys_membarrier() has zero-sized read-side critical
> sections, either comprising a single instruction (as is the case for
> synchronize_sched(), actually), preempt-disable regions of code
> (which are irrelevant to userspace execution), or the spaces between
> consecutive pairs of instructions (as is the case for the newer
> IPI-based implementation).
> 
> The model picks the single-instruction option, and I haven't yet found
> a problem with this -- which is no surprise given that, as you say,
> an actual implementation makes this same choice.

I believe that for RCU tests the LKMM gives the same results for
length-zero critical sections interspersed between all the instructions
and length-one critical sections surrounding all instructions (except
synchronize_rcu).  But the proof is tricky and I haven't checked it
carefully.

> > > The other thing that took some time to get used to is the possibility
> > > of long delays during sys_membarrier() execution, allowing significant
> > > execution and reordering between different CPUs' IPIs.  This was key
> > > to my understanding of the six-process example, and probably needs to
> > > be clearly called out, including in an example or two.
> > 
> > In all the examples I'm aware of, no more than one of the IPIs
> > generated by each sys_membarrier call really matters.  (Of course,
> > there's no way to know in advance which one it will be, so you have to
> > send an IPI to every CPU.)  The execution delays and reordering
> > between different CPUs' IPIs don't appear to be significant.
> 
> Well, there are litmus tests that are allowed in which the allowed
> execution is more easily explained in terms of delays between different
> CPUs' IPIs, so it seems worth keeping track of.
> 
> There might be a litmus test that can tell the difference between
> simultaneous and non-simultaneous IPIs, but I cannot immediately think of
> one that matters.  Might be a failure of imagination on my part, though.

	P0	P1	P2
	Wc=1	[mb01]	Rb=1
	memb	Wa=1	Rc=0
	Ra=0	Wb=1	[mb02]

The IPIs have to appear in the positions shown, which means they cannot
be simultaneous.  The test is allowed because P2's reads can be
reordered.

Alan