Message-ID: <d02f00b0-75ca-483d-85c9-82269cf70072@amd.com>
Date: Fri, 9 Jan 2026 16:06:45 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Valentin Schneider <vschneid@...hat.com>, Huang Shijie
	<shijie8@...il.com>, <mingo@...hat.com>, <peterz@...radead.org>,
	<vincent.guittot@...aro.org>
CC: <dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
	<mgorman@...e.de>, <linux-kernel@...r.kernel.org>, <vineethr@...ux.ibm.com>,
	<cl@...ux.com>
Subject: Re: [PATCH v7 1/1] sched: update the rq->avg_idle when a task is
 moved to an idle CPU

Hello Valentin,

On 1/9/2026 2:42 PM, Valentin Schneider wrote:
> On 26/12/25 14:32, Huang Shijie wrote:
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -3609,6 +3609,21 @@ static inline void ttwu_do_wakeup(struct task_struct *p)
>>       trace_sched_wakeup(p);
>>  }
>>
>> +void update_rq_avg_idle(struct rq *rq)
>> +{
>> +	if (rq->idle_stamp) {
>> +		u64 delta = rq_clock(rq) - rq->idle_stamp;
>> +		u64 max = 2*rq->max_idle_balance_cost;
>> +
>> +		update_avg(&rq->avg_idle, delta);
>> +
>> +		if (rq->avg_idle > max)
>> +			rq->avg_idle = max;
>> +
>> +		rq->idle_stamp = 0;
>> +	}
>> +}
>> +
> 
> So if we have this invoked every time we switch to the idle task via
> put_prev_task_idle(), do we want to move sched_balance_newidle()'s update
> of rq->idle_stamp() to set_next_task_idle()?
>
> That does change the behaviour as we'd now record any idle duration as
> opposed to only idle-from-fair duration, but that would mean we'd
> unconditionally record a rq->idle_stamp and could thus ditch the if{} clause.

So I'm a wee bit skeptical of this - the avg_idle also serves as a
bailout for newidle_balance(). If a task keeps waking up during newidle
balance, we would like to discourage further attempts at newidle balance
for a while to avoid the CPU being stuck doing newidle balance while
having runnable tasks woken up on it.

There is no bailout past should_we_balance(), and for large domains, it
can take a while to get out of balancing.

If we move this to {put_prev,set_next}_task_idle(), we'll completely
fail to capture that part of newidle balance bailout and I'm afraid
we'll start doing newidle balance more aggressively.
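
To make the concern concrete, here is a rough userspace sketch (not
kernel code) of the feedback loop described above. The struct and the
sim_* names are hypothetical stand-ins; sim_update_avg() follows the
1/8-weight shape of the scheduler's update_avg(), and
sim_should_newidle_balance() is a simplified assumption about how
avg_idle is compared against a balance cost. Short idle periods that are
cut off by wakeups pull avg_idle down until the gate says "skip":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct rq fields discussed here. */
struct sim_rq {
	uint64_t clock;			/* models rq_clock(rq), in ns */
	uint64_t idle_stamp;		/* 0 means "no idle period being timed" */
	uint64_t avg_idle;		/* running average of idle durations */
	uint64_t max_idle_balance_cost;
};

/* Move the average 1/8 of the way towards the new sample. */
static void sim_update_avg(uint64_t *avg, uint64_t sample)
{
	int64_t diff = (int64_t)(sample - *avg);

	*avg += diff / 8;
}

/* Mirrors update_rq_avg_idle() from the quoted patch. */
static void sim_update_rq_avg_idle(struct sim_rq *rq)
{
	if (rq->idle_stamp) {
		uint64_t delta = rq->clock - rq->idle_stamp;
		uint64_t max = 2 * rq->max_idle_balance_cost;

		sim_update_avg(&rq->avg_idle, delta);

		if (rq->avg_idle > max)
			rq->avg_idle = max;

		rq->idle_stamp = 0;
	}
}

/* Assumed, simplified gate: only balance if we expect to idle long enough. */
static bool sim_should_newidle_balance(const struct sim_rq *rq, uint64_t cost)
{
	return rq->avg_idle >= cost;
}

/* One round: stamp at newidle balance, then a wakeup arrives idle_ns later. */
static void sim_idle_round(struct sim_rq *rq, uint64_t idle_ns)
{
	rq->idle_stamp = rq->clock;
	rq->clock += idle_ns;
	sim_update_rq_avg_idle(rq);
}
```

If the stamp were instead taken only at the switch to the idle task, the
time spent inside the balance itself would not be part of delta in this
model, which is the part of the bailout I'm worried about losing.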

I'll get some data over the weekend for the different variants being
discussed here - if it doesn't reveal anything drastic, we can
consider moving this accounting to idle task's switch.

-- 
Thanks and Regards,
Prateek

