Message-ID: <1490345799.2766.15.camel@sipsolutions.net>
Date:   Fri, 24 Mar 2017 09:56:39 +0100
From:   Johannes Berg <johannes@...solutions.net>
To:     linux-kernel <linux-kernel@...r.kernel.org>
Cc:     Nicolai Stange <nicstange@...il.com>,
        "Paul E.McKenney" <paulmck@...ux.vnet.ibm.com>,
        gregkh <gregkh@...uxfoundation.org>, sharon.dvir@...el.com,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-wireless <linux-wireless@...r.kernel.org>
Subject: Re: deadlock in synchronize_srcu() in debugfs?

On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> Isn't it possible for the following to happen?
> 
> CPU1					CPU2
> 
> mutex_lock(&M);
> 					full_proxy_xyz();
> 					srcu_read_lock(&debugfs_srcu);
> 					real_fops->xyz();
> 					mutex_lock(&M);
> debugfs_remove(F);
> synchronize_srcu(&debugfs_srcu);


So I'm pretty sure that this can happen. I'm not convinced that it's
happening here, but still.

I tried to make lockdep flag it, but the only way I could get it to
flag it was to do this:

--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -235,7 +235,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 	preempt_disable();
 	retval = __srcu_read_lock(sp);
 	preempt_enable();
-	rcu_lock_acquire(&(sp)->dep_map);
+	lock_map_acquire(&(sp)->dep_map);
 	return retval;
 }
 
@@ -249,7 +249,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	rcu_lock_release(&(sp)->dep_map);
+	lock_map_release(&(sp)->dep_map);
 	__srcu_read_unlock(sp, idx);
 }
 
diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index ef3bcfb15b39..0f9e542ca3f2 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -395,6 +395,9 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
 			 lock_is_held(&rcu_sched_lock_map),
 			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
 
+	lock_map_acquire(&sp->dep_map);
+	lock_map_release(&sp->dep_map);
+
 	might_sleep();
 	init_completion(&rcu.completion);
 

The lock_map_acquire() in srcu_read_lock() isn't really desirable
though, since it makes lockdep flag recursive (nested) read-side
critical sections as bad. If I change it to lock_map_acquire_read()
instead, the deadlock doesn't get flagged, for some reason. I thought
it should be.


Regardless though, I don't see a way to solve this problem for debugfs.
We have a ton of debugfs files in net/mac80211/debugfs.c that need to
acquire e.g. the RTNL (or other locks), and I'm not sure we can easily
avoid removing the debugfs files under the RTNL, since we get all our
configuration callbacks with the RTNL already held...

Need to think about that, but perhaps there's some other solution?

johannes
