Message-ID: <96d7c406-95dc-43a2-9daf-819b78979c75@amd.com>
Date: Thu, 10 Apr 2025 21:07:44 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>, <linux-kernel@...r.kernel.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt
	<rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Gautham R.
 Shenoy" <gautham.shenoy@....com>, Swapnil Sapkal <swapnil.sapkal@....com>
Subject: Re: [RFC PATCH 5/5] sched/fair: Proactive idle balance using push
 mechanism

On 4/10/2025 3:59 PM, Peter Zijlstra wrote:

[..snip..]

>>   /*
>>    * See if the non running fair tasks on this rq can be sent on other CPUs
>>    * that fits better with their profile.
>>    */
>>   static bool push_fair_task(struct rq *rq)
>>   {
>> +	struct cpumask *cpus = this_cpu_cpumask_var_ptr(load_balance_mask);
>> +	struct task_struct *p = pick_next_pushable_fair_task(rq);
>> +	int cpu, this_cpu = cpu_of(rq);
>> +
>> +	if (!p)
>> +		return false;
>> +
>> +	if (!cpumask_and(cpus, nohz.idle_cpus_mask, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)))
>> +		goto requeue;
> 
> So I think the main goal here should be to get rid of the whole single
> nohz balancing thing.
> 
> This global state/mask has been shown to be a problem over and over again.
> 
> Ideally we keep a nohz idle mask per LLC (right next to the overload
> mask you introduced earlier), along with a bit in the sched_domain tree
> upwards of that to indicate a particular llc/ node / distance-group has
> nohz idle.
> 
> Then if the topmost domain has the bit set it means there are nohz cpus
> to be found, and we can (slowly) iterate the domain tree up from
> overloaded LLC to push tasks around.

I'll go through fair.c to understand all the use cases of
"nohz.idle_cpus_mask" and then start with this bit for v2 to see if that
blows up in some way. I'll be back shortly.
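
A rough userspace sketch of the scheme suggested above might look like the
following. It is a toy model and not kernel code: every name in it
(toy_domain, toy_llc, toy_set_nohz_idle() and so on) is hypothetical, CPUs
are modelled as bits in a uint64_t, and the per-domain "bit" is kept as a
counter so that it can be cleared again without rescanning the children.

```c
/* Userspace toy model, NOT kernel code. All names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_domain {
	struct toy_domain *parent;	/* NULL at the topmost domain */
	unsigned int nr_nohz_idle;	/* the "bit" is set when non-zero */
};

struct toy_llc {
	struct toy_domain dom;		/* lowest level of the domain tree */
	uint64_t span;			/* CPUs that belong to this LLC */
	uint64_t nohz_idle_mask;	/* per-LLC mask of nohz idle CPUs */
};

static bool toy_has_nohz_idle(const struct toy_domain *d)
{
	return d->nr_nohz_idle != 0;
}

/* Mark @cpu as nohz idle in its LLC and propagate the hint upwards. */
static void toy_set_nohz_idle(struct toy_llc *llc, int cpu)
{
	uint64_t bit;

	assert(cpu >= 0 && cpu < 64);
	bit = UINT64_C(1) << cpu;
	assert(llc->span & bit);

	if (llc->nohz_idle_mask & bit)
		return;			/* already accounted for */

	llc->nohz_idle_mask |= bit;
	for (struct toy_domain *d = &llc->dom; d; d = d->parent)
		d->nr_nohz_idle++;
}

/* Undo toy_set_nohz_idle() when @cpu leaves nohz idle. */
static void toy_clear_nohz_idle(struct toy_llc *llc, int cpu)
{
	uint64_t bit;

	assert(cpu >= 0 && cpu < 64);
	bit = UINT64_C(1) << cpu;

	if (!(llc->nohz_idle_mask & bit))
		return;

	llc->nohz_idle_mask &= ~bit;
	for (struct toy_domain *d = &llc->dom; d; d = d->parent)
		d->nr_nohz_idle--;
}

/*
 * Starting from an overloaded LLC, return the lowest domain that contains
 * a nohz idle CPU, or NULL if the topmost domain says there is none.
 */
static struct toy_domain *toy_find_nohz_domain(struct toy_llc *llc)
{
	struct toy_domain *d, *top = &llc->dom;

	while (top->parent)
		top = top->parent;

	/* Cheap early exit: nothing is nohz idle anywhere in the system. */
	if (!toy_has_nohz_idle(top))
		return NULL;

	for (d = &llc->dom; d; d = d->parent)
		if (toy_has_nohz_idle(d))
			return d;

	return NULL;
}
```

In this model a push from an overloaded LLC only ever inspects its own
ancestors, so no single global mask is read or written on the push path.
How this would map onto the real sched_domain hierarchy and its locking
is left open here.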

> 
> Anyway, yes, you gotta start somewhere :-)

Thanks a ton for the initial review. I'll go analyze more to see what
bits are making benchmarks go sad.

-- 
Thanks and Regards,
Prateek

