linux-kernel - Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus test)

Message-ID: <20230126184802.GF2948950@paulmck-ThinkPad-P17-Gen-1>
Date:   Thu, 26 Jan 2023 10:48:02 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
Cc:     Alan Stern <stern@...land.harvard.edu>,
        Andrea Parri <parri.andrea@...il.com>,
        Jonas Oberhauser <jonas.oberhauser@...wei.com>,
        Peter Zijlstra <peterz@...radead.org>, will <will@...nel.org>,
        "boqun.feng" <boqun.feng@...il.com>, npiggin <npiggin@...il.com>,
        dhowells <dhowells@...hat.com>,
        "j.alglave" <j.alglave@....ac.uk>,
        "luc.maranget" <luc.maranget@...ia.fr>, akiyks <akiyks@...il.com>,
        dlustig <dlustig@...dia.com>, joel <joel@...lfernandes.org>,
        urezki <urezki@...il.com>,
        quic_neeraju <quic_neeraju@...cinc.com>,
        frederic <frederic@...nel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus
 test)

On Thu, Jan 26, 2023 at 01:17:49PM +0100, Jonas Oberhauser wrote:
> 
> 
> On 1/26/2023 2:53 AM, Paul E. McKenney wrote:
> > On Wed, Jan 25, 2023 at 08:45:44PM -0500, Alan Stern wrote:
> > > On Wed, Jan 25, 2023 at 03:33:08PM -0800, Paul E. McKenney wrote:
> > > > Ah, and returning to the earlier question as to whether srcu_read_unlock()
> > > > can use release semantics instead of smp_mb(), at the very least, this
> > > > portion of the synchronize_srcu() function's header comment must change:
> > > > 
> > > > 	On systems with more than one CPU, when synchronize_srcu()
> > > > 	returns, each CPU is guaranteed to have executed a full
> > > > 	memory barrier since the end of its last corresponding SRCU
> > > > 	read-side critical section whose beginning preceded the call
> > > > 	to synchronize_srcu().
> > > 
> > > Of course, there might be code relying on a guarantee that
> > > srcu_read_unlock() executes a full memory barrier.  This guarantee would
> > > certainly no longer hold.  But as I understand it, this guarantee was
> > > never promised by the SRCU subsystem.
> > That indented sentence was copied from the synchronize_srcu() function's
> > header comment, which might be interpreted by some as a promise by the
> > SRCU subsystem.
> 
> I think we understand that it is a promise of the SRCU subsystem, the
> question is just what the promise is.
> As Alan said, if the promise is interpreted as something like
> 
> "every store that propagated to the read side critical section must have
> propagated to all CPUs before the synchronize_srcu() ends" (where the RSCS
> and synchronize_srcu() calls are those from the promise)
> 
> then that guarantee holds even if you only use a release fence to
> communicate the end of the RSCS to the GP. Note that this interpretation is
> analogous to the promise of smp_mb__after_unlock_lock(), which says that an
> UNLOCK+LOCK pair act as a full fence: here the read-side unlock+gp act as a
> full memory barrier.

Good point that the existing smp_mb__after_srcu_read_unlock() can be used
for any use cases relying on the more literal interpretation of this promise.
We already have the work-around!  ;-)

> On the other hand, if the promise is more literally interpreted as
> 
> "there is a (possibly virtual) instruction in the reader-side execution
> stream that acts as a full memory barrier, and that instruction is executed
> before the synchronize_srcu() ends"
> 
> then that guarantee is violated, and I suppose you might be able to write
> some absurd client that inspects every store of the reader thread and sees
> that there is no line in the reader side code that acts like a full fence.
> But it would take a lot of effort to discern this.

The usual litmus test is shown at the end of this email.  If you remove
the "//" from any of those smp_mb() calls, the outcome in the "exists"
clause is forbidden, but with all of them commented out, it is allowed.
This illustrates the utility of smp_mb__after_srcu_read_unlock().  It
also shows that LKMM does not model this guarantee from
synchronize_srcu()'s header comment.  Which might be fine, actually.

Of course, I just now wrote this litmus test, so it should be viewed
with extreme suspicion.

> Perhaps someone interpreting the promise like this might however come to the
> conclusion that because the only part of the code that is actually under
> control of srcu, and hence the only code where that full barrier could be
> hidden, would be inside the srcu_read_unlock(), they might expect to always
> find this full barrier there and treat srcu_read_unlock() in general as a
> full barrier. Considering that the wording explicitly isn't "an
> srcu_read_unlock() is a full barrier", I hope few people would have this
> unhealthy idea. But you never know.

Given that the more literal interpretation is not unreasonable, we should
assume that someone somewhere might have interpreted it that way.

But I agree that the odds of someone actually relying on this are low,
and any such use case can be fixed with smp_mb__before_srcu_read_unlock(),
similar to smp_mb__after_srcu_read_unlock() that you note is already in use.
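
As a rough illustration of that fix, here is a minimal userspace sketch
in C.  Everything named srcu_* or smp_mb__* below is a stand-in written
for this example with C11 atomics, not the kernel implementation:
srcu_read_unlock() is modeled with release ordering only (the weakening
under discussion), and smp_mb__after_srcu_read_unlock() is modeled as a
full fence.  The variables a and c and the domain my_srcu are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Userspace stand-ins so that this sketch compiles outside the kernel.
 * They are assumptions made for illustration, NOT the kernel's SRCU.
 */
struct srcu_struct {
	atomic_int nreaders;
};

static int srcu_read_lock(struct srcu_struct *sp)
{
	atomic_fetch_add_explicit(&sp->nreaders, 1, memory_order_seq_cst);
	return 0;	/* Single-index stand-in. */
}

/* Models the proposed weakening: release ordering only, no full barrier. */
static void srcu_read_unlock(struct srcu_struct *sp, int idx)
{
	int old;

	(void)idx;
	old = atomic_fetch_sub_explicit(&sp->nreaders, 1, memory_order_release);
	assert(old > 0);	/* Unlock must pair with an earlier lock. */
}

/* Stand-in for the kernel helper of the same name, modeled as a full fence. */
static void smp_mb__after_srcu_read_unlock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static struct srcu_struct my_srcu;	/* Hypothetical SRCU domain. */
static atomic_int a, c;			/* Hypothetical shared variables. */

/* Reader that needs a full barrier after its critical section. */
static int reader(void)
{
	int idx, r2, r3;

	idx = srcu_read_lock(&my_srcu);
	r2 = atomic_load_explicit(&a, memory_order_relaxed);
	/* The proposed smp_mb__before_srcu_read_unlock() would go here. */
	srcu_read_unlock(&my_srcu, idx);
	smp_mb__after_srcu_read_unlock();	/* Explicit full barrier. */
	r3 = atomic_load_explicit(&c, memory_order_relaxed);
	return (r2 << 1) | r3;	/* Pack both results for the caller. */
}
```

A reader written this way states its ordering requirement explicitly
instead of relying on whatever barrier srcu_read_unlock() happens to
contain, which is the property that matters if the unlock is weakened.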

It would still be good to scan SRCU use for this sort of pattern, maybe
manually, maybe via something like coccinelle.  Alternatively, I could
post on my blog (with right of first refusal to LWN and you guys as
co-authors) telling the community of our intent to change this and see
what people say.  Probably both rather than either/or.

Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

C C-srcu-observed-6

(*
 * Result: Sometimes
 *
 * The result is Never if any of the smp_mb() calls is uncommented.
 *)

{}

P0(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	int r1;
	int r2;
	int r3;
	int r4;

	r1 = srcu_read_lock(s);
	WRITE_ONCE(*b, 2);
	r2 = READ_ONCE(*a);
	// smp_mb();
	srcu_read_unlock(s, r1);
	// smp_mb();
	r3 = READ_ONCE(*c);
	// smp_mb();
	r4 = READ_ONCE(*d);
}

P1(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	WRITE_ONCE(*b, 1);
	synchronize_srcu(s);
	WRITE_ONCE(*c, 1);
}

P2(int *a, int *b, int *c, int *d, struct srcu_struct *s)
{
	WRITE_ONCE(*d, 1);
	smp_mb();
	WRITE_ONCE(*a, 1);
}

exists (0:r2=1 /\ 0:r3=1 /\ 0:r4=0 /\ b=1)