Message-ID: <20200921232639.GK29330@paulmck-ThinkPad-P72>
Date:   Mon, 21 Sep 2020 16:26:39 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Herbert Xu <herbert@...dor.apana.org.au>
Cc:     Eric Biggers <ebiggers@...nel.org>, tytso@....edu,
        linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
        stable@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] random: use correct memory barriers for crng_node_pool

On Tue, Sep 22, 2020 at 08:11:04AM +1000, Herbert Xu wrote:
> On Mon, Sep 21, 2020 at 08:27:14AM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 21, 2020 at 06:19:39PM +1000, Herbert Xu wrote:
> > > On Thu, Sep 17, 2020 at 09:58:02AM -0700, Eric Biggers wrote:
> > > >
> > > > smp_load_acquire() is obviously correct, whereas READ_ONCE() is an optimization
> > > > that is difficult to tell whether it's correct or not.  For trivial data
> > > > structures it's "easy" to tell.  But whenever there is a->b where b is an
> > > > internal implementation detail of another kernel subsystem, the use of which
> > > > could involve accesses to global or static data (for example, spin_lock()
> > > > accessing lockdep stuff), a control dependency can slip in.
> > > 
> > > If we're going to follow this line of reasoning, surely you should
> > > be converting the RCU dereference first and foremost, no?
> 
> ...
> 
> > And to Eric's point, it is also true that when you have pointers to
> > static data, and when the compiler can guess this, you do need something
> > like smp_load_acquire().  But this is a problem only when you are (1)
> > using feedback-driven compiler optimization or (2) when you compare the
> > pointer to the address of the static data.
> 
> Let me restate what I think Eric is saying.  He is concerned about
> the case where a->b and b is some opaque object that may in turn
> dereference a global data structure unconnected to a.  The case
> in question here is crng_node_pool in drivers/char/random.c which
> in turn contains a spin lock.

As long as the compiler generates code that reaches that global via
pointer a, everything will work fine.  Which it will, unless the guy
writing the code makes the mistake of introducing a comparison between the
pointer to be dereferenced and the address of the global data structure.

So this is OK:

	p = rcu_dereference(a);
	do_something(p->b);

This is not OK:

	p = rcu_dereference(a);
	if (p == &some_global_variable)
		we_really_should_not_have_done_that_comparison();
	do_something(p->b);

This last one is not OK because the compiler can transform it as
follows:

	p = rcu_dereference(a);
	if (p == &some_global_variable) {
		we_really_should_not_have_done_that_comparison();
		do_something(some_global_variable.b);
	} else {
		do_something(p->b);
	}

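In the transformed code, the access to some_global_variable.b no longer
carries an address dependency from the rcu_dereference().  Only a control
dependency remains, and that does not order the load.  The compiler is
not allowed to make up that sort of comparison on its own, based on my
February 2020 discussion with the standards committee.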
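
For readers who want to experiment outside the kernel, here is a minimal
userspace sketch of the "something like smp_load_acquire()" alternative
mentioned earlier in the thread.  It assumes C11 atomics as a stand-in for
the kernel primitives: the acquire load is stronger than what
rcu_dereference() provides, and it is used here only because C11 has no
dependable equivalent of dependency ordering.  The names publish() and
read_b(), the layout of struct foo, and the was_global out-parameter are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	int b;
};

/* Hypothetical static instance, standing in for some_global_variable. */
static struct foo some_global_variable;

/* Hypothetical published pointer, standing in for "a". */
static _Atomic(struct foo *) a;

/* Writer: initialize the object, then publish it with release semantics. */
static void publish(struct foo *p, int b)
{
	p->b = b;
	atomic_store_explicit(&a, p, memory_order_release);
}

/*
 * Reader: the acquire load orders the later read of ->b after the
 * pointer load, so the result stays correct even if the compiler
 * rewrites p->b as some_global_variable.b on the path where the
 * comparison succeeded.
 */
static int read_b(int *was_global)
{
	struct foo *p = atomic_load_explicit(&a, memory_order_acquire);

	assert(p != NULL);
	*was_global = (p == &some_global_variable);
	return p->b;
}
```

With an acquire load the comparison against the static object is harmless,
at the cost of a stronger (and on some architectures more expensive) load
than a dependency-ordered one.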

> But this reasoning could apply to any data structure that contains
> a spin lock, in particular ones that are dereferenced through RCU.

I lost you on this one.  What is special about a spin lock?

Here is what I think you mean:

	struct foo {
		spinlock_t lock;
		int a;
		char b;
		long c;
	};

	struct foo *a;

	...

	p = rcu_dereference(a);
	BUG_ON(!p);
	if (is_this_the_one(p)) {
		spin_lock(&p->lock);
		do_something_else(p);
		spin_unlock(&p->lock);
	}

This should be fine.  Or were you thinking of some other example?
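
A rough userspace analogue of the example above, for experimentation only.
It assumes C11 atomics, uses a hypothetical test-and-set lock (struct
tiny_lock) in place of spinlock_t, and uses an acquire load in place of
rcu_dereference(), which is stronger than the kernel primitive.  The
helpers publish_foo() and bump_c() are likewise made up for this sketch.
The point it illustrates is that the lock is reached only through the
pointer p, never through a separately named global.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal test-and-set lock, a hypothetical stand-in for spinlock_t. */
struct tiny_lock {
	atomic_flag held;
};

static void tiny_lock_acquire(struct tiny_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void tiny_lock_release(struct tiny_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct foo {
	struct tiny_lock lock;
	int a;
	char b;
	long c;
};

/* Hypothetical published pointer, standing in for "a" in the example. */
static _Atomic(struct foo *) gp;

/* Writer: fully initialize the object before publishing the pointer. */
static void publish_foo(struct foo *p)
{
	atomic_flag_clear(&p->lock.held);
	p->a = 0;
	p->b = 0;
	p->c = 0;
	atomic_store_explicit(&gp, p, memory_order_release);
}

/* Reader: every access, including the lock, goes through p. */
static long bump_c(void)
{
	struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);
	long v;

	assert(p != NULL);
	tiny_lock_acquire(&p->lock);
	v = ++p->c;
	tiny_lock_release(&p->lock);
	return v;
}
```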

> So my question is: if this reasoning is valid, then why aren't we first
> converting rcu_dereference to use smp_load_acquire?

For LTO in ARM, rumor has it that Will is doing so.  Which was what
motivated the BoF on this topic at Linux Plumbers Conference.

							Thanx, Paul
