Message-ID: <CAKfTPtA-S8syqMWLX6Hrf7sndMfbtGD9UEPX9G+0w2-tgto-6g@mail.gmail.com>
Date: Tue, 16 Dec 2025 09:47:40 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: Shijie Huang <shijie@...eremail.onmicrosoft.com>
Cc: Huang Shijie <shijie@...amperecomputing.com>, mingo@...hat.com, peterz@...radead.org,
juri.lelli@...hat.com, patches@...erecomputing.com, cl@...ux.com,
Shubhang@...amperecomputing.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
linux-kernel@...r.kernel.org, vschneid@...hat.com, vineethr@...ux.ibm.com,
kprateek.nayak@....com
Subject: Re: [PATCH v6 2/2] sched: update the rq->avg_idle when a task is
moved to an idle CPU
On Tue, 16 Dec 2025 at 08:39, Shijie Huang
<shijie@...eremail.onmicrosoft.com> wrote:
>
>
> On 16/12/2025 15:17, Vincent Guittot wrote:
> > On Tue, 16 Dec 2025 at 07:22, Shijie Huang
> > <shijie@...eremail.onmicrosoft.com> wrote:
> >>
> >> On 13/12/2025 09:36, Vincent Guittot wrote:
> >>> put_prev_task_idle() would be a better place to call
> >>> update_rq_avg_idle() because this is when we leave idle.
> >> The update_rq_avg_idle() is not only called by current CPU, but also
> >> called by other CPUs. For example, the try_to_wake_up(),
> >> update_rq_avg_idle() is called by the other CPUs. So enqueue_task()
> >> is a good place.
> > But put_prev_task_idle() is called by local CPU whenever it leaves
> > idle so instead of trying to catch all places that could make the CPU
> > leave idle it's better to use this single place.
> > And as you mentioned, put_prev_task_idle is only called by local CPU
> > whereas enqueue_task can be called by all CPUs creating useless
> > pressure in the variable.
>
> The rq->idle_stamp is set at sched_balance_newidle(). then we call
> update_rq_avg_idle() in put_prev_task_idle() right now. How can we
> update the rq->avg_idle?
I'm not sure I understand your point.
rq->avg_idle tracks idle time. The easiest way would be to use
- set_next_task_idle() when we enter idle
- put_prev_task_idle() when we exit idle
Except that sched_balance_newidle() can be long and the time should be
accounted as idle time too. So instead of using set_next_task_idle(),
we use sched_balance_newidle() to set rq->idle_stamp, which is okay
because sched_balance_newidle() is always called before going to idle.
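
To make the accounting concrete, here is a rough user-space sketch, not
kernel code. struct model_rq, model_enter_idle() and model_exit_idle()
are made-up names, and the 1/8 averaging weight is an assumption for
illustration. The stamp is taken where sched_balance_newidle() would
run, and the average is updated where put_prev_task_idle() would run:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two fields discussed; not the kernel's struct rq. */
struct model_rq {
	uint64_t idle_stamp;	/* time the idle period began, 0 when not idle */
	uint64_t avg_idle;	/* running average of idle durations */
};

/*
 * Entry side: plays the role of sched_balance_newidle(), so the time
 * spent looking for work is counted as idle time too.
 * Note: 0 is the "not idle" sentinel, so 'now' must be non-zero.
 */
static void model_enter_idle(struct model_rq *rq, uint64_t now)
{
	rq->idle_stamp = now;
}

/*
 * Exit side: plays the role of put_prev_task_idle(), which only the
 * local CPU runs when it leaves idle.
 */
static void model_exit_idle(struct model_rq *rq, uint64_t now)
{
	int64_t diff;

	if (!rq->idle_stamp)
		return;		/* no idle period in progress */

	assert(now >= rq->idle_stamp);

	/* Assumed averaging rule: move 1/8 of the way toward the new sample. */
	diff = (int64_t)(now - rq->idle_stamp) - (int64_t)rq->avg_idle;
	rq->avg_idle = (uint64_t)((int64_t)rq->avg_idle + diff / 8);

	rq->idle_stamp = 0;	/* idle period has been accounted */
}
```

With avg_idle starting at 0, an idle period from t=100 to t=900 moves
the average to 100. A second exit without a new stamp changes nothing,
which is the point of having a single update site on the local CPU.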
>
> Thanks
>
> Huang Shijie
>
> >>