Message-Id: <20251013150959.298288-1-steve.wahl@hpe.com>
Date: Mon, 13 Oct 2025 10:09:59 -0500
From: Steve Wahl <steve.wahl@....com>
To: Steve Wahl <steve.wahl@....com>,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>,
        Ingo Molnar <mingo@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org
Cc: Russ Anderson <rja@....com>, Dimitri Sivanich <sivanich@....com>,
        Kyle Meyer <kyle.meyer@....com>
Subject: [PATCH] tick/sched: Use trylock for jiffies updates by non-timekeeper CPUs

On large NUMA systems, while running a test program that saturates the
inter-processor and inter-NUMA links, acquiring the jiffies_lock can
be very expensive.  If the CPU designated to do jiffies updates
(tick_do_timer_cpu) gets delayed, other CPUs decide to do the jiffies
update themselves, and a large number of them decide to do so at the
same time.  The inexpensive check against tick_next_period is far
quicker than actually acquiring the lock, so most of these CPUs get in
line to obtain the lock.  If obtaining the lock is slow enough, this
spirals into the vast majority of CPUs being continuously stuck
waiting for the lock, only to obtain it and find that time has already
been updated by another CPU.  For example, on one random entry into
kdb via a manually injected NMI, I saw 2912 of 3840 CPUs stuck here.
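
For illustration only, here is a minimal userspace model of the
pattern that produces the pile-up (a hypothetical sketch, not the
kernel code; the names next_period, period_lock and update_period are
made up).  Every thread does a cheap lockless check first, but once
'now' passes the deadline, all of them take the slow path and queue on
the same lock:

  /* Hypothetical userspace model of the pile-up; not kernel code. */
  #include <pthread.h>
  #include <stdatomic.h>

  static atomic_long next_period;         /* models tick_next_period */
  static pthread_mutex_t period_lock = PTHREAD_MUTEX_INITIALIZER;

  static void update_period(long now)
  {
          /* Cheap lockless check: the common case returns here. */
          if (now < atomic_load_explicit(&next_period,
                                         memory_order_acquire))
                  return;

          /*
           * Past the deadline, every thread reaches this point and
           * blocks on the lock, even though only the first one
           * through has any work to do.
           */
          pthread_mutex_lock(&period_lock);
          if (now >= atomic_load_explicit(&next_period,
                                          memory_order_acquire))
                  atomic_store_explicit(&next_period, now + 1,
                                        memory_order_release);
          pthread_mutex_unlock(&period_lock);
  }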

To avoid this, in tick_sched_do_timer() have CPUs that are not the
official timekeeper only try for the lock, and if it is held by
another CPU, leave the updating of jiffies to the lock holder.  If the
update is not yet guaranteed complete, do not reset
ts->stalled_jiffies, so that the check for stalled jiffies continues
on the next tick.
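
Sketched in the same hypothetical userspace model (again not the
kernel code; try_update_period is a made-up name), the trylock variant
lets a caller that is not the designated timekeeper back off instead
of queueing, and reports whether the update is guaranteed complete so
the caller's stall counter can keep running:

  /* Trylock variant of the hypothetical model; not kernel code. */
  #include <pthread.h>
  #include <stdatomic.h>
  #include <stdbool.h>

  static atomic_long next_period;         /* models tick_next_period */
  static pthread_mutex_t period_lock = PTHREAD_MUTEX_INITIALIZER;

  static bool try_update_period(long now)
  {
          /* Cheap lockless check, as before. */
          if (now < atomic_load_explicit(&next_period,
                                         memory_order_acquire))
                  return true;

          /* Lock busy: the holder is updating; don't queue behind it. */
          if (pthread_mutex_trylock(&period_lock) != 0)
                  return false;   /* update may not be complete yet */

          if (now >= atomic_load_explicit(&next_period,
                                          memory_order_acquire))
                  atomic_store_explicit(&next_period, now + 1,
                                        memory_order_release);
          pthread_mutex_unlock(&period_lock);
          return true;
  }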

With this change, when manually interrupting the test, I find at most
one CPU in the tick_do_update_jiffies64() function.

Signed-off-by: Steve Wahl <steve.wahl@....com>
---
 kernel/time/tick-sched.c | 46 ++++++++++++++++++++++++++++++++--------
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index c527b421c865..706d4e235989 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -54,7 +54,7 @@ static ktime_t last_jiffies_update;
 /*
  * Must be called with interrupts disabled !
  */
-static void tick_do_update_jiffies64(ktime_t now)
+static bool _tick_do_update_jiffies64(ktime_t now, bool trylock)
 {
 	unsigned long ticks = 1;
 	ktime_t delta, nextp;
@@ -70,7 +70,7 @@ static void tick_do_update_jiffies64(ktime_t now)
 	 */
 	if (IS_ENABLED(CONFIG_64BIT)) {
 		if (ktime_before(now, smp_load_acquire(&tick_next_period)))
-			return;
+			return true;
 	} else {
 		unsigned int seq;
 
@@ -84,18 +84,24 @@ static void tick_do_update_jiffies64(ktime_t now)
 		} while (read_seqcount_retry(&jiffies_seq, seq));
 
 		if (ktime_before(now, nextp))
-			return;
+			return true;
 	}
 
 	/* Quick check failed, i.e. update is required. */
-	raw_spin_lock(&jiffies_lock);
+	if (trylock) {
+		/* The CPU holding the lock will do the update. */
+		if (!raw_spin_trylock(&jiffies_lock))
+			return false;
+	} else {
+		raw_spin_lock(&jiffies_lock);
+	}
 	/*
 	 * Re-evaluate with the lock held. Another CPU might have done the
 	 * update already.
 	 */
 	if (ktime_before(now, tick_next_period)) {
 		raw_spin_unlock(&jiffies_lock);
-		return;
+		return true;
 	}
 
 	write_seqcount_begin(&jiffies_seq);
@@ -147,6 +153,27 @@ static void tick_do_update_jiffies64(ktime_t now)
 
 	raw_spin_unlock(&jiffies_lock);
 	update_wall_time();
+	return true;
+}
+
+/*
+ * Obtains the lock and does not return until the update is complete.
+ * Must be called with interrupts disabled.
+ */
+static void tick_do_update_jiffies64(ktime_t now)
+{
+	_tick_do_update_jiffies64(now, false);
+}
+
+/*
+ * Returns early if another CPU holds the lock; in that case the lock
+ * holder's update may still be in progress on return.
+ * Must be called with interrupts disabled.
+ * Returns false if the update might not be complete yet.
+ */
+static bool tick_attempt_update_jiffies64(ktime_t now)
+{
+	return _tick_do_update_jiffies64(now, true);
 }
 
 /*
@@ -239,10 +266,11 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
 		ts->stalled_jiffies = 0;
 		ts->last_tick_jiffies = READ_ONCE(jiffies);
 	} else {
-		if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
-			tick_do_update_jiffies64(now);
-			ts->stalled_jiffies = 0;
-			ts->last_tick_jiffies = READ_ONCE(jiffies);
+		if (++ts->stalled_jiffies >= MAX_STALLED_JIFFIES) {
+			if (tick_attempt_update_jiffies64(now)) {
+				ts->stalled_jiffies = 0;
+				ts->last_tick_jiffies = READ_ONCE(jiffies);
+			}
 		}
 	}
 
-- 
2.26.2