Message-ID: <Y8JacQO1PW7va7rf@Boquns-Mac-mini.local>
Date:   Fri, 13 Jan 2023 23:32:01 -0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Hillf Danton <hdanton@...a.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [PATCH 2/3] rcu: Equip sleepable RCU with lockdep dependency
 graph checks

On Sat, Jan 14, 2023 at 03:18:32PM +0800, Hillf Danton wrote:
> On Fri, 13 Jan 2023 16:17:59 -0800 Boqun Feng <boqun.feng@...il.com>
> > On Sat, Jan 14, 2023 at 07:58:09AM +0800, Hillf Danton wrote:
> > > On 13 Jan 2023 09:58:10 -0800 Boqun Feng <boqun.feng@...il.com>
> > > > On Fri, Jan 13, 2023 at 09:03:30PM +0800, Hillf Danton wrote:
> > > > > On 12 Jan 2023 22:59:54 -0800 Boqun Feng <boqun.feng@...il.com>
> > > > > > --- a/kernel/rcu/srcutree.c
> > > > > > +++ b/kernel/rcu/srcutree.c
> > > > > > @@ -1267,6 +1267,8 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
> > > > > >  {
> > > > > >  	struct rcu_synchronize rcu;
> > > > > >  
> > > > > > +	srcu_lock_sync(&ssp->dep_map);
> > > > > > +
> > > > > >  	RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
> > > > > >  			 lock_is_held(&rcu_bh_lock_map) ||
> > > > > >  			 lock_is_held(&rcu_lock_map) ||
> > > > > > -- 
> > > > > > 2.38.1
> > > > > 
> > > > > The following deadlock is able to escape srcu_lock_sync() because the
> > > > > __lock_release folded in sync leaves one lock on the sync side.
> > > > > 
> > > > > 	cpu9		cpu0
> > > > > 	---		---
> > > > > 	lock A		srcu_lock_acquire(&ssp->dep_map);
> > > > > 	srcu_lock_sync(&ssp->dep_map);
> > > > > 			lock A
> > > > 
> > > > But isn't it just the srcu_mutex_ABBA test case in patch #3, and my run
> > > > of lockdep selftest shows we can catch it. Anything subtle I'm missing?
> > > 
> > > I am leaning to not call it ABBA deadlock, because B is unlocked.
> > > 
> > > 	task X		task Y
> > > 	---		---
> > > 	lock A
> > > 	lock B
> > > 			lock B
> > > 	unlock B
> > > 	wait_for_completion E
> > > 
> > > 			lock A
> > > 			complete E
> > > 
> > > And no deadlock should be detected/caught after B goes home.
> > 
> > Your example makes me more confused.. given the case:
> > 
> > 	task X		task Y
> > 	---		---
> > 	mutex_lock(A);
> > 			srcu_read_lock(B);
> > 	synchronize_srcu(B);
> > 			mutex_lock(A);
> > 
> > isn't it a deadlock?
> 
> Yes and nope, see below.
> 
> > In your example, which one of A, B or E is srcu?
> 
> A and B are mutex, and E is completion in my example to show the failure
> of catching deadlock in case of non-fake lock. Now see srcu after your change.
> 
>  	task X			task Y
>  	---			---
>  	mutex_lock(A);
>  				srcu_read_lock(B);
> 				srcu_lock_acquire(&B->dep_map);
> 				a) lock_map_acquire_read(&B->dep_map);
>  	synchronize_srcu(B);
> 	__synchronize_srcu(B);
> 	srcu_lock_sync(&B->dep_map);
> 	lock_map_sync(&B->dep_map);
> 	lock_sync(&B->dep_map);
> 	__lock_acquire(&B->dep_map);

At this point, lockdep adds the dependency A -> B to the dependency
graph.

> 				b) lock_map_acquire_read(&B->dep_map);
> 	__lock_release(&B->dep_map);
> 				c) lock_map_acquire_read(&B->dep_map);
>  				mutex_lock(A);

and here, lockdep will try to add the dependency B -> A to the
dependency graph, and will find that A -> B -> A forms a cycle (of
strong dependencies); therefore lockdep knows it's a deadlock.

>  
> No deadlock could be detected if taskY takes mutexA after taskX releases B,

The timing of taskX releasing B doesn't matter, since lockdep detects
deadlocks from the dependency graph rather than by after-the-fact
detection.

> and how taskY acquires B does not matter as per the a), b) and c) modes in
> the above chart, again because releasing lock can break deadlock in general.

I have test cases showing that the above deadlock can be detected, so
if you think there is a deadlock that may escape my change, feel free
to add a test case in lib/locking-selftest.c ;-)

Regards,
Boqun