Message-ID: <176907693368.510.564694445484485322.tip-bot2@tip-bot2>
Date: Thu, 22 Jan 2026 10:15:33 -0000
From: "tip-bot2 for Shubhang Kaushik" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Shubhang Kaushik <shubhang@...amperecomputing.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: [tip: sched/core] sched: Update rq->avg_idle when a task is moved to
an idle CPU
The following commit has been merged into the sched/core branch of tip:
Commit-ID: 4b603f1551a73e2868b9e7a14b3938c23275cefb
Gitweb: https://git.kernel.org/tip/4b603f1551a73e2868b9e7a14b3938c23275cefb
Author: Shubhang Kaushik <shubhang@...amperecomputing.com>
AuthorDate: Wed, 21 Jan 2026 01:31:53 -08:00
Committer: Peter Zijlstra <peterz@...radead.org>
CommitterDate: Thu, 22 Jan 2026 11:11:21 +01:00
sched: Update rq->avg_idle when a task is moved to an idle CPU
Currently, rq->idle_stamp is only used to calculate avg_idle during
wakeups. Other paths that move a task to an idle CPU, such as
fork/clone, execve, or migration, do not mark the end of the CPU's idle
period from the scheduler's point of view, leaving avg_idle inaccurate.
Introduce update_rq_avg_idle() to provide a more accurate measurement
of CPU idle duration. Invoking this helper from put_prev_task_idle()
ensures avg_idle is updated whenever a CPU stops being idle, regardless
of how the new task arrived.
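For readers less familiar with this bookkeeping, here is a minimal
userspace sketch of the logic the patch consolidates. It assumes the
kernel's update_avg() semantics (the average moves one-eighth of the
way toward each new sample); struct sim_rq, sim_update_rq_avg_idle()
and the main() driver are illustrative stand-ins, not kernel code.

/* Minimal userspace sketch; not kernel code. */
#include <stdio.h>
#include <stdint.h>

struct sim_rq {				/* illustrative stand-in for struct rq */
	uint64_t avg_idle;		/* running average of idle durations */
	uint64_t idle_stamp;		/* clock value when the CPU went idle */
	uint64_t max_idle_balance_cost;	/* clamp bound, as in the kernel */
};

/* Mirrors kernel update_avg(): move avg 1/8th of the way toward sample. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/* The logic hoisted into update_rq_avg_idle() by this patch. */
static void sim_update_rq_avg_idle(struct sim_rq *rq, uint64_t now)
{
	uint64_t delta = now - rq->idle_stamp;
	uint64_t max = 2 * rq->max_idle_balance_cost;

	update_avg(&rq->avg_idle, delta);
	if (rq->avg_idle > max)
		rq->avg_idle = max;
	rq->idle_stamp = 0;
}

int main(void)
{
	struct sim_rq rq = {
		.avg_idle = 0,
		.idle_stamp = 1000,		/* CPU went idle at t=1000 */
		.max_idle_balance_cost = 500000,
	};

	sim_update_rq_avg_idle(&rq, 9000);	/* a task arrives at t=9000 */
	printf("avg_idle after one 8000ns idle period: %llu\n",
	       (unsigned long long)rq.avg_idle);
	return 0;
}

Compiled with a stock C compiler, this prints 1000: one-eighth of the
8000ns sample, since the average starts at zero and the clamp does not
trigger.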
Testing on an 80-core Ampere Altra (ARMv8) against a 6.19-rc5 baseline:
- Hackbench: +7.2% performance gain at 16 threads.
- Schbench: reduced p99.9 tail latencies at high concurrency.
Signed-off-by: Shubhang Kaushik <shubhang@...amperecomputing.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@...aro.org>
Tested-by: Shubhang Kaushik <shubhang@...amperecomputing.com>
Link: https://patch.msgid.link/20260121-v8-patch-series-v8-1-b7f1cbee5055@os.amperecomputing.com
---
kernel/sched/core.c | 24 ++++++++++++------------
kernel/sched/idle.c | 1 +
kernel/sched/sched.h | 1 +
3 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3cca012..c5431af 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3613,6 +3613,18 @@ static inline void ttwu_do_wakeup(struct task_struct *p)
trace_sched_wakeup(p);
}
+void update_rq_avg_idle(struct rq *rq)
+{
+ u64 delta = rq_clock(rq) - rq->idle_stamp;
+ u64 max = 2*rq->max_idle_balance_cost;
+
+ update_avg(&rq->avg_idle, delta);
+
+ if (rq->avg_idle > max)
+ rq->avg_idle = max;
+ rq->idle_stamp = 0;
+}
+
static void
ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
struct rq_flags *rf)
@@ -3648,18 +3660,6 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
p->sched_class->task_woken(rq, p);
rq_repin_lock(rq, rf);
}
-
- if (rq->idle_stamp) {
- u64 delta = rq_clock(rq) - rq->idle_stamp;
- u64 max = 2*rq->max_idle_balance_cost;
-
- update_avg(&rq->avg_idle, delta);
-
- if (rq->avg_idle > max)
- rq->avg_idle = max;
-
- rq->idle_stamp = 0;
- }
}
/*
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 65eb8f8..aba5ad5 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -460,6 +460,7 @@ static void put_prev_task_idle(struct rq *rq, struct task_struct *prev, struct t
{
update_curr_idle(rq);
scx_update_idle(rq, false, true);
+ update_rq_avg_idle(rq);
}
static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool first)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 58c9d24..127633b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1670,6 +1670,7 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
#endif /* !CONFIG_FAIR_GROUP_SCHED */
+extern void update_rq_avg_idle(struct rq *rq);
extern void update_rq_clock(struct rq *rq);
/*
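As a hedged usage note: with CONFIG_SCHED_DEBUG, the per-runqueue
avg_idle and max_idle_balance_cost values appear in the scheduler debug
output, so the effect of this change can be observed directly. The
sketch below filters those lines; the debugfs path is an assumption
(older kernels expose the same data via /proc/sched_debug) and the
field names can vary by kernel version.

/* Sketch: print per-CPU avg_idle lines from the sched debug file. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	const char *path = "/sys/kernel/debug/sched/debug";
	char line[512];
	FILE *f = fopen(path, "r");

	if (!f) {
		perror(path);	/* needs debugfs mounted and root */
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (strstr(line, "avg_idle") ||
		    strstr(line, "max_idle_balance_cost"))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}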