Message-ID: <20230112003627.GA3133092@paulmck-ThinkPad-P17-Gen-1>
Date: Wed, 11 Jan 2023 16:36:27 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: riel@...riel.com, davej@...emonkey.org.uk
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: [PATCH diagnostic qspinlock] Diagnostics for excessive lock-drop wait loop time

We see systems stuck in the queued_spin_lock_slowpath() loop that waits
for the lock to become unlocked in the case where the current CPU has
set the pending bit. Therefore, this not-for-mainline commit emits a
warning that includes the lock word state if the loop has been spinning
for more than 10 seconds. It also adds a WARN_ON_ONCE() that complains
if the pending bit is not set in the lock word.

If this is to be placed in production, some reporting mechanism not
involving spinlocks is likely needed, for example, BPF, trace events,
or some combination thereof.

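For readers who want to experiment outside the kernel, here is a rough
userspace sketch of the same idea: decode the lock word the way the
WARN() format string does, and consult the clock only once per batch of
polls. It is an illustration under stated assumptions, not the kernel
code. The Q_* masks assume the lock-word layout used when NR_CPUS < 16K
(locked byte in bits 0-7, pending in bit 8, tail in bits 16-31).
SPIN_BUDGET, format_lock_word(), past(), spin_until_unlocked() and the
max_polls parameter are hypothetical names introduced for this sketch;
max_polls exists only so that the demo terminates.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumed lock-word layout (the kernel's NR_CPUS < 16K case). */
#define Q_LOCKED_MASK 0x000000ffu /* bits 0-7: locked byte */
#define Q_PENDING_VAL 0x00000100u /* bit 8: pending */
#define Q_TAIL_SHIFT  16          /* bits 16-31: tail */
#define SPIN_BUDGET   512         /* hypothetical stand-in for _Q_PENDING_LOOPS */

static_assert((Q_LOCKED_MASK & Q_PENDING_VAL) == 0, "fields must not overlap");

/* Render a lock word like the patch's WARN(): raw value, L/P flags, tail. */
static int format_lock_word(uint32_t val, char *buf, size_t len)
{
	return snprintf(buf, len, "%#x (%c%c%#x)", (unsigned int)val,
			".L"[!!(val & Q_LOCKED_MASK)],
			".P"[!!(val & Q_PENDING_VAL)],
			(unsigned int)(val >> Q_TAIL_SHIFT));
}

/* True once the monotonic clock has reached *deadline. */
static bool past(const struct timespec *deadline)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return now.tv_sec > deadline->tv_sec ||
	       (now.tv_sec == deadline->tv_sec &&
		now.tv_nsec >= deadline->tv_nsec);
}

/*
 * Poll *word until the locked byte clears or max_polls is used up
 * (the patch itself spins without such a bound). The clock is consulted
 * once per SPIN_BUDGET polls and at most one report is printed.
 * Returns true if a report was printed.
 */
static bool spin_until_unlocked(_Atomic uint32_t *word, time_t timeout_s,
				unsigned long max_polls)
{
	struct timespec deadline;
	bool warned = false;
	int cnt = SPIN_BUDGET;
	char buf[64];

	clock_gettime(CLOCK_MONOTONIC, &deadline);
	deadline.tv_sec += timeout_s;

	while (max_polls--) {
		uint32_t val = atomic_load_explicit(word, memory_order_acquire);

		if (!(val & Q_LOCKED_MASK))
			break;
		if (--cnt)
			continue;
		cnt = SPIN_BUDGET;
		if (!warned && past(&deadline)) {
			format_lock_word(val, buf, sizeof(buf));
			fprintf(stderr, "Still pending and locked: %s\n", buf);
			warned = true;
		}
	}
	return warned;
}
```

Checking the clock only when the countdown expires keeps the common
path to a single load and a decrement, which is presumably why the
patch reuses _Q_PENDING_LOOPS for the same purpose.
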
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ac5a3e6d3b564..be1440782c4b3 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -379,8 +379,22 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+	if (val & _Q_LOCKED_MASK) {
+		int cnt = _Q_PENDING_LOOPS;
+		unsigned long j = jiffies + 10 * HZ;
+		struct qspinlock qval;
+		int val;
+
+		for (;;) {
+			val = atomic_read_acquire(&lock->val);
+			atomic_set(&qval.val, val);
+			WARN_ON_ONCE(!(val & _Q_PENDING_VAL));
+			if (!(val & _Q_LOCKED_MASK))
+				break;
+			if (!--cnt && !WARN(time_after(jiffies, j), "%s: Still pending and locked: %#x (%c%c%#x)\n", __func__, val, ".L"[!!qval.locked], ".P"[!!qval.pending], qval.tail))
+				cnt = _Q_PENDING_LOOPS;
+		}
+	}
 
 	/*
 	 * take ownership and clear the pending bit.