linux-kernel - Re: [PATCH] random: use correct memory barriers for crng_node

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200922190936.GB1616407@gmail.com>
Date:   Tue, 22 Sep 2020 12:09:36 -0700
From:   Eric Biggers <ebiggers@...nel.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Herbert Xu <herbert@...dor.apana.org.au>, tytso@....edu,
        linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
        stable@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] random: use correct memory barriers for crng_node_pool

On Tue, Sep 22, 2020 at 11:31:00AM -0700, Paul E. McKenney wrote:
> > Also, it's not just the p == &global_variable case.  Consider:
> > 
> > struct a { struct b *b; };
> > struct b { ... };
> > 
> > Thread 1:
> > 
> > 	/* one-time initialized data shared by all instances of b */
> > 	static struct c *c;
> > 
> > 	void init_b(struct a *a)
> > 	{
> > 		if (!c)
> > 			c = alloc_c();
> > 
> > 		smp_store_release(&a->b, kzalloc(sizeof(struct b)));
> > 	}
> > 
> > Thread 2:
> > 
> > 	void use_b_if_present(struct a *a)
> > 	{
> > 		struct b *b = READ_ONCE(a->b);
> > 
> > 		if (b) {
> > 			c->... # crashes because c still appears to be NULL
> > 		}
> > 	}
> > 
> > 
> > So when the *first* "b" is allocated, the global data "c" is initialized.  Then
> > when using a "b", we expect to be able to access "c".  But there's no
> > data dependency from "b" to "c"; it's a control dependency only.
> > So smp_load_acquire() is needed, not READ_ONCE().
> > 
> > And it can be an internal implementation detail of "b"'s subsystem whether it
> > happens to use global data "c".
> 
> Given that "c" is static, these two subsystems must be in the same
> translation unit.  So I don't see how this qualifies as being internal to
> "b"'s subsystem.

You're missing the point here.  b and c could easily be allocated by a function
alloc_b() that's in another file.

> Besides which, control dependencies should be used only by LKMM experts
> at this point.  

What does that even mean?  Control dependencies are everywhere.

> > This sort of thing is why people objected to the READ_ONCE() optimization during
> > the discussion at
> > https://lkml.kernel.org/linux-fsdevel/20200717044427.68747-1-ebiggers@kernel.org/T/#u.
> > Most kernel developers aren't experts in the LKMM, and they want something
> > that's guaranteed to be correct without having to to think really hard about it
> > and make assumptions about the internal implementation details of other
> > subsystems, how compilers have implemented the C standard, and so on.
> 
> And smp_load_acquire()is provided for that reason.  Its name was
> even based on the nomenclature used in the C standard and elsewhere.
> And again, control dependencies are for LKMM experts, as they are very
> tricky to get right.

How does a developer know that the code they're calling in another subsystem
wasn't written by one of these "experts" and therefore has a control dependency?

> 
> But in the LKMM documentation, you are likely to find LKMM experts who
> want to optimize all the way, particularly in cases like the one-time
> init pattern where all the data is often local.  And the best basis for
> READ_ONCE() in one-time init is not a control dependency, but rather
> ordering of accesses to a single variable from a single task combined
> with locking, both of which are quite robust and much easier to use,
> especially in comparison to control dependencies.
> 
> My goal for LKMM is not that each and every developer have a full
> understanding of every nook and cranny of that model, but instead that
> people can find the primitives supporting the desired point in the
> performance/simplicity tradoff space.  And yes, I have more writing
> to do to make more progress towards that goal.

So are you saying people should use smp_load_acquire(), or are you saying people
should use READ_ONCE()?

- Eric