Message-Id: <20251203053056.4128-2-shijie@os.amperecomputing.com>
Date: Wed, 3 Dec 2025 13:30:55 +0800
From: Huang Shijie <shijie@...amperecomputing.com>
To: mingo@...hat.com,
peterz@...radead.org,
juri.lelli@...hat.com,
vincent.guittot@...aro.org
Cc: patches@...erecomputing.com,
cl@...ux.com,
Shubhang@...amperecomputing.com,
dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
linux-kernel@...r.kernel.org,
vschneid@...hat.com,
vineethr@...ux.ibm.com,
kprateek.nayak@....com,
Huang Shijie <shijie@...amperecomputing.com>
Subject: [PATCH v5 1/2] sched/fair: set rq->idle_stamp at the end of the sched_balance_newidle
In the current newidle balance, rq->idle_stamp may be left at a non-zero
value if the balance cannot pull any task.
On wakeup, the code detects the non-zero rq->idle_stamp, updates
rq->avg_idle, and then ends the CPU idle status by setting rq->idle_stamp
to zero.
Apart from wakeup, the current code does not end the CPU idle status
when a task is moved to the idle CPU, for example via fork/clone, execve,
or other paths.
To fix this issue, we want to add a hook (update_rq_avg_idle())
in enqueue_task(). With this hook, if a task is moved to the idle CPU,
rq->avg_idle is updated. Unfortunately, this hook is also called
during the newidle balance:
sched_balance_newidle() --> sched_balance_rq() --> .. --> enqueue_task()
If we still set rq->idle_stamp at the beginning of sched_balance_newidle(),
the rq->avg_idle will not be updated correctly.
To make this work correctly, save the timestamp in a local variable at
the beginning of sched_balance_newidle(). If the newidle balance cannot
pull any task, set rq->idle_stamp to the saved value. With this method,
the newidle balance still works correctly, and the hook in enqueue_task()
also works correctly.
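For illustration only, the ordering problem can be modeled in user space.
This is a minimal sketch, not kernel code: struct toy_rq and every toy_*
function are hypothetical stand-ins, the 5-tick balance cost is made up,
and the hook body is an assumption based on the description above (it
takes a sample and clears idle_stamp whenever idle_stamp is non-zero).

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct rq: only the fields this patch discusses. */
struct toy_rq {
	uint64_t clock;        /* models rq_clock(rq); must be non-zero */
	uint64_t idle_stamp;   /* non-zero while the CPU counts as idle */
	unsigned int samples;  /* how many times avg_idle was updated */
	uint64_t last_sample;  /* last idle duration fed into avg_idle */
};

/* Hypothetical model of the update_rq_avg_idle() hook. */
static void toy_update_rq_avg_idle(struct toy_rq *rq)
{
	if (!rq->idle_stamp)
		return;
	rq->last_sample = rq->clock - rq->idle_stamp;
	rq->samples++;
	rq->idle_stamp = 0;    /* ends the idle period */
}

/* Model of enqueue_task() once it calls the hook. */
static void toy_enqueue_task(struct toy_rq *rq)
{
	toy_update_rq_avg_idle(rq);
}

/* Old ordering: stamp first, clear only after a successful pull. */
static int toy_newidle_old(struct toy_rq *rq, int can_pull)
{
	rq->idle_stamp = rq->clock;
	rq->clock += 5;                 /* assumed time spent balancing */
	if (can_pull) {
		toy_enqueue_task(rq);   /* hook fires on balance time */
		rq->idle_stamp = 0;
	}
	return can_pull;
}

/* New ordering: keep the stamp in a local until the balance has failed. */
static int toy_newidle_new(struct toy_rq *rq, int can_pull)
{
	uint64_t idle_stamp = rq->clock;

	assert(idle_stamp != 0);        /* 0 is reserved for "not idle" */
	rq->idle_stamp = 0;
	rq->clock += 5;                 /* assumed time spent balancing */
	if (can_pull)
		toy_enqueue_task(rq);   /* hook sees 0 and does nothing */
	else
		rq->idle_stamp = idle_stamp;
	return can_pull;
}
```

In this model, the old ordering lets the hook record the balance time
itself as an idle sample when a task is pulled, while the new ordering
records nothing in that case and still leaves the original timestamp in
place when nothing is pulled.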
Signed-off-by: Huang Shijie <shijie@...amperecomputing.com>
---
kernel/sched/fair.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1855975b8248..cfdd22e5dcab 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12865,6 +12865,7 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
u64 t0, t1, curr_cost = 0;
struct sched_domain *sd;
int pulled_task = 0;
+ u64 idle_stamp;
update_misfit_status(NULL, this_rq);
@@ -12880,7 +12881,9 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
* for CPU_NEWLY_IDLE, such that we measure the this duration
* as idle time.
*/
- this_rq->idle_stamp = rq_clock(this_rq);
+ idle_stamp = rq_clock(this_rq);
+
+ this_rq->idle_stamp = 0;
/*
* Do not pull tasks towards !active CPUs...
@@ -12992,10 +12995,11 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
if (time_after(this_rq->next_balance, next_balance))
this_rq->next_balance = next_balance;
- if (pulled_task)
- this_rq->idle_stamp = 0;
- else
+ if (!pulled_task) {
+ /* Set it here on purpose. */
+ this_rq->idle_stamp = idle_stamp;
nohz_newidle_balance(this_rq);
+ }
rq_repin_lock(this_rq, rf);
--
2.40.1