linux-kernel - [Question] lockdep: Is nested lock handled correctly?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150810095247.GA4606@fixme-laptop.cn.ibm.com>
Date:	Mon, 10 Aug 2015 17:52:47 +0800
From:	Boqun Feng <boqun.feng@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>
Cc:	linux-kernel@...r.kernel.org
Subject: [Question] lockdep: Is nested lock handled correctly?

Hi Peter and Ingo,

I'm now learning the code of lockdep and find that nested lock may not
be handled correctly because we fail to take held_lock merging into
consideration. I come up with an example and hope that could explain my
concern.

Please consider this lock/unlock sequence, I also put a patch ading this
sequence as a test into locking-selftest:

(lock_X1 and lock_X2 belong to the same lock class X, lock_Y1 belongs to
another lock class Y)

spin_lock(&lock_X1);
spin_lock(&lock_Y1);
spin_lock_nested_lock(&lock_X2, &lock_X1);
spin_unlock(&lock_Y1);
spin_unlock(&lock_X2);
spin_unlock(&lock_X1);

This is totally legal in current lockdep rules, right? But the states of
curr->held_locks stack after each lock/unlock show something
interesting:

0.	Initially:
	curr->held_locks is empty, curr->lockdep_depth: 0

1.	spin_lock(&lock_X1);
	curr->held_locks: H1(X), curr->lockdep_depth: 1

	H1(X) means a held_lock structure with ->class_idx pointing the
	class_idx of class X.

2.	spin_lock(&lock_Y1);
	curr->held_locks: H1(X)--H2(Y), curr->lockdep_depth: 2

3.	spin_lock_nested_lock(&lock_X2, &lock_X1);
	curr->held_locks: H1(X)--H2(Y)--H3(X),
	curr->lockdep_depth: 3

4.	spin_unlock(&lock_Y1);
	curr->held_locks: H1(X, references=2), curr->lockdep_depth:1

	DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1) in
	__lock_release() will be triggered, because lockdep_depth
	changes from 3 to 1!

...

This could happen in current lockdep code, and the reason is that when
releasing H2 in __lock_release(), lockdep will call __lock_acquire() to
"reacquire" H3, and __lock_acquire() detects H3 and H1 belong to the
same class, so it will merge H3 into H1.

Therefore "After releasing a held_lock in the stack, the lockdep_depth
will decrease by 1" is not true!

Besides, this hlock-merge-after-release also makes the reference
counting of held_lock goes wrong. Please consider this sequence:

spin_lock(&lock_X1);
spin_lock(&lock_Y1);
spin_lock_nested_lock(&lock_X2, &lock_X1);
spin_lock_nested_lock(&lock_X3, &lock_X2);
spin_unlock(&lock_Y1);
spin_unlock(&lock_X3);
spin_unlock(&lock_X2);
spin_unlock(&lock_X1);

After spin_unlock(&lock_Y1), the curr->held_locks will become:

curr->held_locks: H1(X, references=2), curr->lockdep_depth: 1

But, in fact, we have -three- locks held now.

It seems to me that our current code don't take
hlock-merge-after-release into consideration and this is a problem. Am I
missing something here?

Looking forward to your insight ;-)

Add a patch for test case, which is based on current tip/locking/core.
I compiled the kernel with CONFIG_DEBUG_LOCKING_API_SELFTESTS and
CONFIG_PROVE_LOCKING, and the test fails because
DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1) is triggered.

Thanks and Best Regards,
Boqun

---
 lib/locking-selftest.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 872a15a..00042f9 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1716,6 +1716,16 @@ static void ww_test_spin_context(void)
 	U(A);
 }

+static void bad_order_nested_spin_lock(void)
+{
+	raw_spin_lock(&lock_X1);
+	raw_spin_lock(&lock_Y1);
+	raw_spin_lock_nest_lock(&lock_X2, &lock_X1);
+	raw_spin_unlock(&lock_Y1); /* bad order here */
+	raw_spin_unlock(&lock_X2);
+	raw_spin_unlock(&lock_X1);
+}
+
 static void ww_tests(void)
 {
 	printk("  --------------------------------------------------------------------------\n");
@@ -1856,6 +1866,11 @@ void locking_selftest(void)
 	dotest(rsem_AA3, FAILURE, LOCKTYPE_RWSEM);
 	printk("\n");

+	print_testname("nested spin lock with bad order");
+	printk("|");
+	dotest(bad_order_nested_spin_lock, SUCCESS, LOCKTYPE_SPIN);
+	printk("\n");
+
 	printk("  --------------------------------------------------------------------------\n");

 	/*
-- 
2.5.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/