Message-Id: <20250815065115.289337-3-adamli@os.amperecomputing.com>
Date: Fri, 15 Aug 2025 06:51:15 +0000
From: Adam Li <adamli@...amperecomputing.com>
To: anna-maria@...utronix.de,
frederic@...nel.org,
mingo@...nel.org,
tglx@...utronix.de,
cl@...two.org
Cc: cl@...ux.com,
linux-kernel@...r.kernel.org,
patches@...erecomputing.com,
Adam Li <adamli@...amperecomputing.com>
Subject: [PATCH 2/2] tick/nohz: Trigger warning when CPU in wrong NOHZ idle state
This patch is for debugging only.
The warning is triggered when the CPU is in this state:
1) the tick was already stopped before tick_nohz_idle_stop_tick()
stops the tick
2) and the CPU is not in nohz.idle_cpus_mask
3) and the CPU is idle
4) and the tick is stopped
The CPU will stay idle in this state, since neither the periodic nor
the NOHZ idle load balancing can move a task to this CPU.
Signed-off-by: Adam Li <adamli@...amperecomputing.com>
---
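Note: the four conditions listed in the commit message can be modelled
outside the kernel as a single predicate. The following is only a
userspace sketch. struct cpu_snapshot and in_wrong_nohz_idle_state()
are hypothetical names invented for illustration and are not part of
this patch or of the kernel. Each field stands in for one of the checks
the patch passes to WARN_ON_ONCE().

```c
#include <stdbool.h>

/* Hypothetical userspace snapshot of one CPU; not a kernel structure. */
struct cpu_snapshot {
	bool was_stopped;       /* tick already stopped on entry */
	bool in_nohz_idle_mask; /* models nohz_balance_idle_cpu(cpu) */
	bool idle;              /* models idle_cpu(cpu) */
	bool tick_stopped;      /* models tick_nohz_tick_stopped_cpu(cpu) */
};

/* Mirrors the boolean expression passed to WARN_ON_ONCE() in the patch. */
static bool in_wrong_nohz_idle_state(const struct cpu_snapshot *s)
{
	return s->was_stopped && !s->in_nohz_idle_mask &&
	       s->idle && s->tick_stopped;
}
```

Only the combination "tick stopped, idle, but missing from the idle
mask" makes the predicate true. A CPU that is present in the mask, or
whose tick was still running on entry, does not match.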
include/linux/sched/nohz.h | 2 ++
kernel/sched/fair.c | 5 +++++
kernel/time/tick-sched.c | 3 ++-
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
index 0db7f67935fe..ea6e07777395 100644
--- a/include/linux/sched/nohz.h
+++ b/include/linux/sched/nohz.h
@@ -9,8 +9,10 @@
#ifdef CONFIG_NO_HZ_COMMON
extern void nohz_balance_enter_idle(int cpu);
extern int get_nohz_timer_target(void);
+extern bool nohz_balance_idle_cpu(int cpu);
#else
static inline void nohz_balance_enter_idle(int cpu) { }
+static inline bool nohz_balance_idle_cpu(int cpu) { return false; }
#endif
#ifdef CONFIG_NO_HZ_COMMON
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..cd1c17368e05 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7109,6 +7109,11 @@ static struct {
unsigned long next_blocked; /* Next update of blocked load in jiffies */
} nohz ____cacheline_aligned;
+inline bool nohz_balance_idle_cpu(int cpu)
+{
+ return cpumask_test_cpu(cpu, nohz.idle_cpus_mask);
+}
+
#endif /* CONFIG_NO_HZ_COMMON */
static unsigned long cpu_load(struct rq *rq)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index b900a120ab54..8241b14842f3 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1228,7 +1228,8 @@ void tick_nohz_idle_stop_tick(void)
ts->idle_sleeps++;
ts->idle_expires = expires;
-
+ WARN_ON_ONCE(was_stopped && !nohz_balance_idle_cpu(cpu) &&
+ idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu));
if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
if (!was_stopped)
ts->idle_jiffies = ts->last_jiffies;
--
2.34.1