Message-Id: <20170608205500.GC3721@linux.vnet.ibm.com>
Date:   Thu, 8 Jun 2017 13:55:00 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Krister Johansen <kjlx@...pleofstupid.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
        josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
        rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
        fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com,
        stable@...r.kernel.org, gregkh@...uxfoundation.org
Subject: Re: [PATCH tip/core/rcu 45/88] rcu: Add memory barriers for NOCB
 leader wakeup

On Thu, Jun 08, 2017 at 01:11:48PM -0700, Krister Johansen wrote:
> Hi Paul,
> 
> On Thu, May 25, 2017 at 02:59:18PM -0700, Paul E. McKenney wrote:
> > Wait/wakeup operations do not guarantee ordering on their own.  Instead,
> > either locking or memory barriers are required.  This commit therefore
> > adds memory barriers to wake_nocb_leader() and nocb_leader_wait().
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > ---
> >  kernel/rcu/tree_plugin.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 0b1042545116..573fbe9640a0 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -1810,6 +1810,7 @@ static void wake_nocb_leader(struct rcu_data *rdp, bool force)
> >  	if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
> >  		/* Prior smp_mb__after_atomic() orders against prior enqueue. */
> >  		WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
> > +		smp_mb(); /* ->nocb_leader_sleep before swake_up(). */
> >  		swake_up(&rdp_leader->nocb_wq);
> >  	}
> >  }
> > @@ -2064,6 +2065,7 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
> >  	 * nocb_gp_head, where they await a grace period.
> >  	 */
> >  	gotcbs = false;
> > +	smp_mb(); /* wakeup before ->nocb_head reads. */
> >  	for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower) {
> >  		rdp->nocb_gp_head = READ_ONCE(rdp->nocb_head);
> >  		if (!rdp->nocb_gp_head)
> 
> May I impose upon you to CC this patch to stable, and tag it as fixing
> abedf8e241?  I ran into this on a production 4.9 branch.  When I
> debugged it, I discovered that it went all the way back to 4.6.  The
> tl;dr is that at least for some environments, the missed wakeup
> manifests itself as a series of hung-task warnings to console and if I'm
> unlucky it can also generate a hang that can block interactive logins
> via ssh.

Interesting!  This is the first that I have heard that this was anything
other than a theoretical bug.  To the comment in your second URL, it is
wise to recall that a seismologist was in fact arrested for failing to
predict an earthquake.  Later acquitted/pardoned/whatever, but arrested
nonetheless.  ;-)

https://www.theguardian.com/world/2012/oct/23/jailing-italian-seismologists-scientific-community

Silliness aside, does my patch actually fix your problem in practice as
well as in theory?  If so, may I have your Tested-by?

Impressive investigative effort, by the way!

							Thanx, Paul
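
The ordering that the quoted patch enforces can be sketched in userspace
with C11 atomics.  This is only an illustration under stated assumptions,
not the kernel code: the demo_* names are hypothetical stand-ins for
->nocb_head and ->nocb_leader_sleep, atomic_thread_fence() stands in for
smp_mb(), and a busy-wait loop stands in for the swait/swake_up() pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ->nocb_head and ->nocb_leader_sleep. */
static _Atomic(int *) demo_head;
static atomic_bool demo_leader_sleep = true;
static int demo_callback = 42;

/* Waker side: enqueue, clear the sleep flag, then (conceptually) wake. */
static void demo_wake_leader(void)
{
	atomic_store_explicit(&demo_head, &demo_callback, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* enqueue before flag store */
	atomic_store_explicit(&demo_leader_sleep, false, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* flag store before wakeup */
	/* A real waker would issue its wakeup call here. */
}

/* Leader side: wait for the flag, then fence before reading the head. */
static void *demo_leader_wait(void *arg)
{
	int **seen = arg;

	while (atomic_load_explicit(&demo_leader_sleep, memory_order_relaxed))
		; /* busy-wait stands in for sleeping on a wait queue */
	atomic_thread_fence(memory_order_seq_cst); /* wakeup before head read */
	*seen = atomic_load_explicit(&demo_head, memory_order_relaxed);
	return NULL;
}

/* Runs one waker/leader round and returns the value the leader observed. */
int demo_run(void)
{
	pthread_t leader;
	int *seen = NULL;
	int rc = pthread_create(&leader, NULL, demo_leader_wait, &seen);

	assert(rc == 0);
	demo_wake_leader();
	pthread_join(leader, NULL);
	return seen ? *seen : -1;
}
```

With the fences in place, a leader that observes the cleared flag is
guaranteed to also observe the enqueued pointer.  Without them, the relaxed
accesses would allow the leader to read a stale (NULL) head, which is the
userspace analogue of the missed wakeup discussed above.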
