linux-kernel - Re: [PATCH v6 2/2] sched: update the rq->avg_idle when a task is moved to an idle CPU


Message-ID: <ac29ecc0-13bc-4af4-b000-4846a40d9261@amperemail.onmicrosoft.com>
Date: Tue, 16 Dec 2025 17:49:00 +0800
From: Shijie Huang <shijie@...eremail.onmicrosoft.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Huang Shijie <shijie@...amperecomputing.com>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, patches@...erecomputing.com,
 cl@...ux.com, Shubhang@...amperecomputing.com, dietmar.eggemann@....com,
 rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
 linux-kernel@...r.kernel.org, vschneid@...hat.com, vineethr@...ux.ibm.com,
 kprateek.nayak@....com
Subject: Re: [PATCH v6 2/2] sched: update the rq->avg_idle when a task is
 moved to an idle CPU


On 16/12/2025 16:47, Vincent Guittot wrote:
> On Tue, 16 Dec 2025 at 08:39, Shijie Huang
> <shijie@...eremail.onmicrosoft.com> wrote:
>>
>> On 16/12/2025 15:17, Vincent Guittot wrote:
>>> On Tue, 16 Dec 2025 at 07:22, Shijie Huang
>>> <shijie@...eremail.onmicrosoft.com> wrote:
>>>> On 13/12/2025 09:36, Vincent Guittot wrote:
>>>>> put_prev_task_idle() would be a better place to call
>>>>> update_rq_avg_idle() because this is when we leave idle.
>>>> update_rq_avg_idle() is called not only by the current CPU but also
>>>> by other CPUs. For example, in try_to_wake_up(), update_rq_avg_idle()
>>>> is called by other CPUs. So enqueue_task() is a good place.
>>> But put_prev_task_idle() is called by local CPU whenever it leaves
>>> idle so instead of trying to catch all places that could make the CPU
>>> leave idle it's better to use this single place.
>>> And as you mentioned, put_prev_task_idle is only called by local CPU
>>> whereas enqueue_task can be called by all CPUs creating useless
>>> pressure in the variable.
>> The rq->idle_stamp is set in sched_balance_newidle(), then we call
>> update_rq_avg_idle() in put_prev_task_idle() right now. How can we
>> update the rq->avg_idle?
> I'm not sure I understand your point.
>
> rq->avg_idle tracks idle time. The easiest way would be to use
> - set_next_task_idle() when we enter idle
> - put_prev_task_idle() when we exit idle
>
> Except that sched_balance_newidle() can be long and the time should be
> accounted as idle time too. So instead of using set_next_task_idle(),
> we use sched_balance_newidle() to set rq->idle_stamp, which is okay because
> sched_balance_newidle() is always called before going to idle.

Thanks for the explanations.

It seems that put_prev_task_idle() is indeed a better place to call
update_rq_avg_idle(). Let me think about it for a while :)
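
For reference, the scheme described above can be modeled roughly as follows.
This is only a user-space sketch, not the actual patch: struct rq_model and
the model_*() helpers are hypothetical stand-ins, the clamp to twice
max_idle_balance_cost is assumed to mirror the existing wakeup-path code, and
update_avg() is assumed to follow the kernel's 1/8-weight running average.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of struct rq used here. */
struct rq_model {
	uint64_t clock;                 /* current time, in ns */
	uint64_t idle_stamp;            /* when idle started; 0 means "not idle" */
	uint64_t avg_idle;              /* running average of idle periods */
	uint64_t max_idle_balance_cost; /* used only to clamp avg_idle */
};

/* Running average: the new sample gets a weight of 1/8. */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/*
 * Entering idle. Models setting the stamp in sched_balance_newidle(), so
 * that the time spent in newidle balancing is accounted as idle time too.
 */
static void model_newidle_enter(struct rq_model *rq)
{
	rq->idle_stamp = rq->clock;
}

/*
 * Leaving idle. Models calling update_rq_avg_idle() from
 * put_prev_task_idle(), which runs only on the local CPU.
 */
static void model_update_rq_avg_idle(struct rq_model *rq)
{
	uint64_t delta, max;

	if (!rq->idle_stamp)
		return; /* no idle period in progress, nothing to account */

	delta = rq->clock - rq->idle_stamp;
	max = 2 * rq->max_idle_balance_cost;

	update_avg(&rq->avg_idle, delta);
	if (rq->avg_idle > max) /* assumed clamp, see note above */
		rq->avg_idle = max;

	rq->idle_stamp = 0; /* idle period consumed */
}
```

With a single exit point on the local CPU, remote CPUs doing a wakeup would
not need to write rq->avg_idle at all in this model.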


Thanks
Huang Shijie


