Message-ID: <CAKfTPtABRgxP4=7f-rNOyh0UG=tBggsT5PObjvrQB8K-f66BYA@mail.gmail.com>
Date: Sat, 13 Dec 2025 02:36:24 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Huang Shijie <shijie@...amperecomputing.com>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
patches@...erecomputing.com, cl@...ux.com, Shubhang@...amperecomputing.com,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, linux-kernel@...r.kernel.org, vschneid@...hat.com,
vineethr@...ux.ibm.com, kprateek.nayak@....com
Subject: Re: [PATCH v6 2/2] sched: update the rq->avg_idle when a task is
moved to an idle CPU
On Tue, 9 Dec 2025 at 10:46, Huang Shijie <shijie@...amperecomputing.com> wrote:
>
> During newidle balance, rq->idle_stamp may be set to a non-zero value
> if no task can be pulled.
>
> At wakeup, the code checks rq->idle_stamp, updates rq->avg_idle, and
> then ends the CPU idle status by setting rq->idle_stamp back to zero.
>
> Apart from wakeup, the current code does not end the CPU idle status
> when a task is moved to an idle CPU by other paths, such as fork/clone,
> execve, or other cases. To get a more accurate rq->avg_idle, it needs
> to be updated in more places (not only at wakeup).
>
> This patch introduces a helper, update_rq_avg_idle(), and calls it
> from enqueue_task(), so rq->avg_idle is updated when a task is moved
> to an idle CPU at:
> -- wakeup
> -- fork/clone
> -- execve
> -- idle balance
> -- other cases
>
> Signed-off-by: Huang Shijie <shijie@...amperecomputing.com>
> ---
> kernel/sched/core.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9f10cfbdc228..2e3c4043de51 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2078,6 +2078,21 @@ unsigned long get_wchan(struct task_struct *p)
> return ip;
> }
>
> +static void update_rq_avg_idle(struct rq *rq)
> +{
> + if (rq->idle_stamp) {
> + u64 delta = rq_clock(rq) - rq->idle_stamp;
> + u64 max = 2*rq->max_idle_balance_cost;
> +
> + update_avg(&rq->avg_idle, delta);
> +
> + if (rq->avg_idle > max)
> + rq->avg_idle = max;
> +
> + rq->idle_stamp = 0;
> + }
> +}
> +
> void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
> {
> if (!(flags & ENQUEUE_NOCLOCK))
> @@ -2100,6 +2115,8 @@ void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
>
> if (sched_core_enabled(rq))
> sched_core_enqueue(rq, p);
> +
> + update_rq_avg_idle(rq);
put_prev_task_idle() would be a better place to call
update_rq_avg_idle() because this is when we leave idle.
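Something along these lines is what I mean (untested sketch only;
put_prev_task_idle() lives in kernel/sched/idle.c, so update_rq_avg_idle()
would have to be declared in kernel/sched/sched.h instead of being static
in core.c, and the exact prototype depends on the tree you are based on):

static void put_prev_task_idle(struct rq *rq, struct task_struct *prev,
			       struct task_struct *next)
{
	/* existing body ... */

	/* We are leaving idle here: fold the idle period into avg_idle. */
	update_rq_avg_idle(rq);
}
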
> }
>
> /*
> @@ -3645,18 +3662,6 @@ ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags,
> p->sched_class->task_woken(rq, p);
> rq_repin_lock(rq, rf);
> }
> -
> - if (rq->idle_stamp) {
> - u64 delta = rq_clock(rq) - rq->idle_stamp;
> - u64 max = 2*rq->max_idle_balance_cost;
> -
> - update_avg(&rq->avg_idle, delta);
> -
> - if (rq->avg_idle > max)
> - rq->avg_idle = max;
> -
> - rq->idle_stamp = 0;
> - }
> }
>
> /*
> --
> 2.40.1
>