Message-ID: <881eeb68d4e0711bdf73a7fe27cc29d9cae60321.camel@linux.intel.com>
Date: Thu, 02 Oct 2025 10:46:25 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>, "Chen, Yu C" <yu.c.chen@...el.com>
Cc: Ingo Molnar <mingo@...hat.com>, K Prateek Nayak
<kprateek.nayak@....com>, "Gautham R . Shenoy" <gautham.shenoy@....com>,
Vincent Guittot <vincent.guittot@...aro.org>, Juri Lelli
<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel
Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, Libo
Chen <libo.chen@...cle.com>, Madadi Vineeth Reddy
<vineethr@...ux.ibm.com>, Hillf Danton <hdanton@...a.com>, Shrikanth Hegde
<sshegde@...ux.ibm.com>, Jianyong Wu <jianyong.wu@...look.com>, Yangyu Chen
<cyy@...self.name>, Tingyin Duan <tingyin.duan@...il.com>, Vern Hao
<vernhao@...cent.com>, Len Brown <len.brown@...el.com>, Aubrey Li
<aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>, Chen Yu
<yu.chen.surf@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v4 07/28] sched: Add helper function to decide
whether to allow cache aware scheduling

On Thu, 2025-10-02 at 13:50 +0200, Peter Zijlstra wrote:
> On Thu, Oct 02, 2025 at 07:31:40PM +0800, Chen, Yu C wrote:
> > On 10/1/2025 9:17 PM, Peter Zijlstra wrote:
> > > On Sat, Aug 09, 2025 at 01:03:10PM +0800, Chen Yu wrote:
> > > > From: Tim Chen <tim.c.chen@...ux.intel.com>
> > > >
> > > > Cache-aware scheduling is designed to aggregate threads into their
> > > > preferred LLC, either via the task wake up path or the load balancing
> > > > path. One side effect is that when the preferred LLC is saturated,
> > > > more threads will continue to be stacked on it, degrading the workload's
> > > > latency. A strategy is needed to prevent this aggregation from going too
> > > > far such that the preferred LLC is too overloaded.
> > >
> > > So one of the ideas was to extend the preferred llc number to a mask.
> > > Update the preferred mask with (nr_threads / llc_size) bits,
> > > indicating that many top LLCs as sorted by occupancy.
> > >
> > >
> >
> > Having more than one preferred LLC helps prevent aggregation from going
> > too far on a single preferred LLC.
> >
> > One question would be: if one LLC cannot hold all the threads of a
> > process, does a second preferred LLC help in this use case? Currently,
> > this patch gives up task aggregation and falls back to legacy load
> > balancing if the preferred LLC is overloaded. If we place threads
> > across two preferred LLCs, these threads might encounter cross-LLC
> > latency anyway - so we may as well let legacy load balancing spread
> > them out, IMO.
>
> Well, being stuck on 2 LLCs instead of being spread across 10 still
> seems like a win, no?
>
> Remember, our friends at AMD have *MANY* LLCs.
>
> > Another issue that Patch 7 tries to address is avoiding task
> > bouncing between preferred LLCs and non-preferred LLCs. If we
> > introduce a preferred LLC priority list, logic to prevent task
> > bouncing between different preferred LLCs might be needed in
> > load balancing, which could become complicated.
>
> It doesn't really become more difficult to tell a preferred LLC from a
> non-preferred LLC with a mask. So why should things get more complicated?
>
For secondary and maybe tertiary LLCs to work well, the
occupancy ordering between the LLCs has to be relatively
stable. Otherwise we could see many task migrations between
the LLCs when the ordering changes, and frequent task
migrations could be worse for performance.

From previous experiments, we saw that the occupancy could
have some fairly big fluctuations. That's the reason we set
the preferred LLC switch threshold high (2x): we want to be
sure before jerking tasks around to a new LLC.
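
For illustration, something like the check below is what I mean
by the 2x threshold. This is just a userspace sketch; the helper
name and the occupancy metric are made up, not what the series
actually implements:

#include <stdbool.h>

#define LLC_SWITCH_FACTOR	2	/* the "2x" threshold above */

struct llc_stat {
	int id;
	unsigned long occupancy;	/* e.g. recent runtime of this mm */
};

/*
 * Only switch the preferred LLC when a candidate clearly dominates
 * the current one, so small occupancy fluctuations do not jerk
 * tasks around between LLCs.
 */
static bool should_switch_preferred_llc(const struct llc_stat *cur,
					const struct llc_stat *cand)
{
	return cand->occupancy > LLC_SWITCH_FACTOR * cur->occupancy;
}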

With secondary and tertiary LLCs, the LLC ordering would
change more frequently than with just a single preferred
LLC: those LLCs have fewer tasks per mm, so their occupancy
could fluctuate more. One concern is that this could lead to
extra task migrations that negate any cache consolidation
benefit gained.
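
To make that concern concrete, here is a sketch of the mask idea
as I understand it (again a userspace sketch with made-up names,
not code from this series): picking the top (nr_threads /
llc_size) LLCs by occupancy means re-sorting on every update, and
any reshuffle in that order changes the mask and invites
migrations.

#include <stdlib.h>

struct llc_stat { int id; unsigned long occupancy; };

static int cmp_occupancy_desc(const void *a, const void *b)
{
	const struct llc_stat *x = a, *y = b;

	if (x->occupancy != y->occupancy)
		return x->occupancy < y->occupancy ? 1 : -1;
	return 0;
}

/*
 * Build a bitmask of preferred LLCs for a process: one bit per
 * LLC, with as many bits set as the process's threads need LLCs.
 */
static unsigned long preferred_llc_mask(struct llc_stat *llcs, int nr_llc,
					int nr_threads, int llc_size)
{
	int want = nr_threads / llc_size;
	unsigned long mask = 0;
	int i;

	if (want < 1)
		want = 1;
	if (want > nr_llc)
		want = nr_llc;

	/* Any change in this sort order changes the mask below. */
	qsort(llcs, nr_llc, sizeof(*llcs), cmp_occupancy_desc);
	for (i = 0; i < want; i++)
		mask |= 1UL << llcs[i].id;

	return mask;
}
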
Tim
>
> Anyway, it was just one of the 'random' ideas I had kicking about.
> Reality always ruins things, *shrug* :-)