Message-ID: <20200610140210.GT4455@paulmck-ThinkPad-P72>
Date:   Wed, 10 Jun 2020 07:02:10 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Joel Fernandes <joel@...lfernandes.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Josh Triplett <josh@...htriplett.org>
Subject: Re: [PATCH 01/10] rcu: Directly lock rdp->nocb_lock on nocb code
 entrypoints

On Wed, Jun 10, 2020 at 03:12:39PM +0200, Frederic Weisbecker wrote:
> On Tue, Jun 09, 2020 at 11:02:27AM -0700, Paul E. McKenney wrote:
> > > > > And anyway we still want to unconditionally lock on many places,
> > > > > regardless of the offloaded state. I don't know how we could have
> > > > > a magic helper do the unconditional lock on some places and the
> > > > > conditional on others.
> > > > 
> > > > I was assuming (perhaps incorrectly) that an intermediate phase between
> > > > not-offloaded and offloaded would take care of all of those cases.
> > > 
> > > Perhaps partly but I fear that won't be enough.
> > 
> > One approach is to rely on RCU read-side critical sections surrounding
> > the lock acquisition and to stay in the intermediate phase until a grace
> > period completes, preferably call_rcu() instead of synchronize_rcu().
> > 
> > This of course means refusing to do a transition if the CPU is still
> > in the intermediate state from a prior transition.
> 
> That sounds good. But using synchronize_rcu() would be far easier. We
> need to keep the hotplug and rcu barrier locked during the transition.
> 
> > > Also I've been thinking that rcu_nocb_lock() should meet any of these
> > > requirements:
> > > 
> > > * hotplug is locked
> > > * rcu barrier is locked
> > > * rnp is locked
> > > 
> > > Because checking the offloaded state (when nocb isn't locked yet) of
> > > an rdp without any of the above locks held is racy. And that should
> > > be easy to check and prevent from copy-pasta accidents.
> > > 
> > > What do you think?
> > 
> > An RCU read-side critical section might be simpler.
> 
> Ok I think I can manage that.

And just to argue against myself...

Another approach is to maintain multiple explicit states for each
->cblist, perhaps something like this:

1.	In softirq.  Transition code advances to next.
2.	To no-CB 1.  Either GP or CB kthread for the transitioning
	CPU advances to next.  Note that the fact that the
	transition code runs on the transitioning CPU means that
	the RCU softirq handler doesn't need to be involved.
3.	To no-CB 2.  Either GP or CB kthread for the transitioning
	CPU advances to next.
4.	To no-CB 3.  Transitioning code advances to next.
	At this point, the no-CBs setup is fully functional.
5.	No-CB.  Transitioning code advances to next.
	Again, the fact that the transitioning code is running
	on the transitioning CPU with interrupts disabled means
	that the RCU softirq handler need not be explicitly
	involved.
6.	To softirq 1.  The RCU softirq handler for the transitioning
	CPU advances to next.
	At this point, the RCU softirq handler is processing callbacks.
7.	To softirq 2.  Either GP or CB kthread for the transitioning
	CPU advances to next.
	At this point, the softirq handler is processing callbacks.
8.	To softirq 3.  Either GP or CB kthread for the transitioning
	CPU advances to next.
	At this point, the no-CBs setup is fully shut down.
9.	To softirq 4.  Transitioning code advances to next,
	which is the first, "In softirq".
	(This one -might- be unnecessary, but...)

All transitions are of course with the ->nocb_lock held.
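
To make the list above concrete, here is a minimal userspace sketch of the
nine states and of who is allowed to advance each one.  Every name in it
(cb_state, cb_actor, cblist_model, cb_advance) is hypothetical and does not
exist in the kernel, and the nocb_locked flag only models the rule that
->nocb_lock must be held.  It is an illustration of the protocol, not a
proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states, in the order of the numbered list (1..9). */
enum cb_state {
	CB_IN_SOFTIRQ,
	CB_TO_NOCB_1,
	CB_TO_NOCB_2,
	CB_TO_NOCB_3,
	CB_NOCB,
	CB_TO_SOFTIRQ_1,
	CB_TO_SOFTIRQ_2,
	CB_TO_SOFTIRQ_3,
	CB_TO_SOFTIRQ_4,
	CB_NR_STATES
};

/* Hypothetical actors, as bit flags so a state can allow several. */
enum cb_actor {
	ACT_TRANSITION = 1 << 0,	/* the transition code itself */
	ACT_GP_KTHREAD = 1 << 1,	/* GP kthread for this CPU */
	ACT_CB_KTHREAD = 1 << 2,	/* CB kthread for this CPU */
	ACT_SOFTIRQ    = 1 << 3,	/* RCU softirq handler for this CPU */
};

/* Which actors may advance out of each state, per the list. */
static const unsigned int cb_may_advance[CB_NR_STATES] = {
	[CB_IN_SOFTIRQ]   = ACT_TRANSITION,
	[CB_TO_NOCB_1]    = ACT_GP_KTHREAD | ACT_CB_KTHREAD,
	[CB_TO_NOCB_2]    = ACT_GP_KTHREAD | ACT_CB_KTHREAD,
	[CB_TO_NOCB_3]    = ACT_TRANSITION,
	[CB_NOCB]         = ACT_TRANSITION,
	[CB_TO_SOFTIRQ_1] = ACT_SOFTIRQ,
	[CB_TO_SOFTIRQ_2] = ACT_GP_KTHREAD | ACT_CB_KTHREAD,
	[CB_TO_SOFTIRQ_3] = ACT_GP_KTHREAD | ACT_CB_KTHREAD,
	[CB_TO_SOFTIRQ_4] = ACT_TRANSITION,
};

/* Toy stand-in for the per-CPU ->cblist plus its lock state. */
struct cblist_model {
	bool nocb_locked;	/* models "->nocb_lock is held" */
	enum cb_state state;
};

/*
 * Advance to the next state if the lock is held and the caller is one
 * of the actors permitted for the current state.  The last state wraps
 * back to "In softirq".  Returns true if the state changed.
 */
static bool cb_advance(struct cblist_model *cl, enum cb_actor who)
{
	if (!cl->nocb_locked)
		return false;
	if (!(cb_may_advance[cl->state] & (unsigned int)who))
		return false;
	cl->state = (enum cb_state)((cl->state + 1) % CB_NR_STATES);
	return true;
}
```

Keeping the permitted actors in a table keeps the protocol in one place,
so a wrong-actor or unlocked transition is refused rather than silently
applied.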

When there is only one CPU during early boot near rcu_init() time,
the transition from "In softirq" to "No-CB" can remain instantaneous.

This has the advantage of not slowing things down just because there
is an RCU callback flood in progress.  It also uses an explicit
protocol that should (give or take bugs) maintain full safety both
in protection of ->cblist and in dealing with RCU callback floods.

Thoughts?

							Thanx, Paul
