lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100414160057.GD2516@linux.vnet.ibm.com>
Date:	Wed, 14 Apr 2010 09:00:57 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Lai Jiangshan <laijs@...fujitsu.com>,
	David Miller <davem@...emloft.net>, a.p.zijlstra@...llo.nl,
	mingo@...e.hu, sparclinux@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: Weird rcu lockdep warning

On Wed, Apr 14, 2010 at 05:51:11PM +0200, Frederic Weisbecker wrote:
> On Wed, Apr 14, 2010 at 08:43:02AM -0700, Paul E. McKenney wrote:
> > On Wed, Apr 14, 2010 at 11:34:33AM +0800, Lai Jiangshan wrote:
> > > Paul E. McKenney wrote:
> > > > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote:
> > > >> From: Frederic Weisbecker <fweisbec@...il.com>
> > > >> Date: Wed, 14 Apr 2010 02:02:27 +0200
> > > >>
> > > >>> I just have a guess though....
> > > >>> This seems to always happen from NMI path, and lockdep is disabled on NMI.
> > > >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored
> > > >>> and then the rcu_read_lock_held() check has the wrong result...
> > > >> Yeah, I bet that's it too.
> > > >>
> > > >> lock_is_held() can't return anything meaningful while lockdep is
> > > >> disabled, which it is during NMIs.
> > > > 
> > > > Ah!  So I just need to add a "current->lockdep_recursion"
> > > > check to debug_lockdep_rcu_enabled().  And move the function to
> > > > kernel/rcutree_plugin.h to avoid #include hell.
> > > > 
> > > > See below for (untested) patch.
> > > > 
> > > > 						Thanx, Paul
> > > > 
> > > > ------------------------------------------------------------------------
> > > > 
> > > >  include/linux/rcupdate.h |    5 +----
> > > >  kernel/rcutree_plugin.h  |   11 +++++++++++
> > > >  2 files changed, 12 insertions(+), 4 deletions(-)
> > > > 
> > > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1
> > > > Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > > Date:   Tue Apr 13 18:45:51 2010 -0700
> > > > 
> > > >     rcu: Make RCU lockdep check the lockdep_recursion variable
> > > >     
> > > >     The lockdep facility temporarily disables lockdep checking by incrementing
> > > >     the current->lockdep_recursion variable.  Such disabling happens in NMIs
> > > >     and in other situations where lockdep might expect to recurse on itself.
> > > >     This patch therefore checks current->lockdep_recursion, disabling RCU
> > > >     lockdep splats when this variable is non-zero.
> > > >     
> > > >     Reported-by: Frederic Weisbecker <fweisbec@...il.com>
> > > >     Reported-by: David Miller <davem@...emloft.net>
> > > >     Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > > 
> > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > index 9f1ddfe..07db2fe 100644
> > > > --- a/include/linux/rcupdate.h
> > > > +++ b/include/linux/rcupdate.h
> > > > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map;
> > > >  # define rcu_read_release_sched() \
> > > >  		lock_release(&rcu_sched_lock_map, 1, _THIS_IP_)
> > > >  
> > > > -static inline int debug_lockdep_rcu_enabled(void)
> > > > -{
> > > > -	return likely(rcu_scheduler_active && debug_locks);
> > > > -}
> > > > +extern int debug_lockdep_rcu_enabled(void);
> > > >  
> > > >  /**
> > > >   * rcu_read_lock_held - might we be in RCU read-side critical section?
> > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > > > index 79b53bd..2169abe 100644
> > > > --- a/kernel/rcutree_plugin.h
> > > > +++ b/kernel/rcutree_plugin.h
> > > > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void)
> > > >  }
> > > >  
> > > >  #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
> > > > +
> > > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > > +
> > > > +int debug_lockdep_rcu_enabled(void)
> > > > +{
> > > > +	return likely(rcu_scheduler_active &&
> > > > +		      debug_locks &&
> > > > +		      current->lockdep_recursion == 0);
> > > > +}
> > > > +
> > > 
> > > Looks good to me too, but I think
> > > 'likely' is needless since the function is not inline.
> > 
> > Good point.  And to add injury to insult, I forgot EXPORT_SYMBOL_GPL().
> > 
> > Updated patch in the works.
> 
> 
> Note I just tested the patch the previous one and it looks fine now.
> You can then safely consider the "general idea" fixes the problem :)

Thank you, Frederic!!!

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ