Message-ID: <4560ffb7eab64f7b9c6f7eb2ad7430827e19f849.camel@linux.intel.com>
Date: Tue, 28 Oct 2025 08:30:26 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: "Chen, Yu C" <yu.c.chen@...el.com>, K Prateek Nayak
 <kprateek.nayak@....com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>, Juri Lelli	
 <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel
 Gorman <mgorman@...e.de>,  Valentin Schneider	 <vschneid@...hat.com>,
 Madadi Vineeth Reddy <vineethr@...ux.ibm.com>, Hillf Danton
 <hdanton@...a.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>, Jianyong Wu	
 <jianyong.wu@...look.com>, Yangyu Chen <cyy@...self.name>, Tingyin Duan	
 <tingyin.duan@...il.com>, Vern Hao <vernhao@...cent.com>, Len Brown	
 <len.brown@...el.com>, Aubrey Li <aubrey.li@...el.com>, Zhao Liu	
 <zhao1.liu@...el.com>, Chen Yu <yu.chen.surf@...il.com>, Adam Li	
 <adamli@...amperecomputing.com>, Tim Chen <tim.c.chen@...el.com>, 
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
 "Gautham R . Shenoy" <gautham.shenoy@....com>, Ingo Molnar
 <mingo@...hat.com>
Subject: Re: [PATCH 15/19] sched/fair: Respect LLC preference in task
 migration and detach

On Tue, 2025-10-28 at 19:58 +0800, Chen, Yu C wrote:
> Hi Prateek,
> 
> On 10/28/2025 2:02 PM, K Prateek Nayak wrote:
> > Hello Tim,
> > 
> > On 10/11/2025 11:54 PM, Tim Chen wrote:
> > > @@ -9969,6 +9969,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
> > >   	if (env->flags & LBF_ACTIVE_LB)
> > >   		return 1;
> > >   
> > > +#ifdef CONFIG_SCHED_CACHE
> > > +	if (sched_cache_enabled() &&
> > > +	    can_migrate_llc_task(env->src_cpu, env->dst_cpu, p) == mig_forbid)
> > > +		return 0;
> > > +#endif
> > > +
> > >   	degrades = migrate_degrades_locality(p, env);
> > >   	if (!degrades)
> > >   		hot = task_hot(p, env);
> > 
> > Should we care for task_hot() w.r.t. migration cost if a task is being
> > moved to a preferred LLC?
> > 
> 
> This is a good question. The decision not to migrate a task when its
> LLC preference is violated takes priority over the check in task_hot().
> 
> The main reason is that we want cache aware aggregation to be more
> aggressive than generic migration; otherwise, cache-aware migration
> might not take effect according to our previous test. This seems to
> be a trade-off. Another consideration might be: should we consider
> the occupancy of a single thread or that of the entire process?
> For example, suppose t0, t1, and t2 belong to the same process. t0
> and t1 are running on the process's preferred LLC0, while t2 is
> running on the non-preferred LLC1. Even though t2 has high occupancy
> on LLC1 (making it cache-hot on LLC1), we might still want to move t2
> to LLC0 if t0, t1, and t2 read from and write to each other - since we 
> don't want to generate cross-LLC access.
> 
> > Also, should we leave out tasks under core scheduling from the llc
> > aware lb? Even discount them when calculating "mm->nr_running_avg"?
> > 
> Yes, it seems that the cookie match check case was missed, which is
> embedded in task_hot(). I suppose you are referring to the p->core_cookie
> check; I'll look into this direction.
> 
> > > @@ -10227,6 +10233,20 @@ static int detach_tasks(struct lb_env *env)
> > >   		if (env->imbalance <= 0)
> > >   			break;
> > >   
> > > +#ifdef CONFIG_SCHED_CACHE
> > > +		/*
> > > +		 * Don't detach more tasks if the remaining tasks want
> > > +		 * to stay. We know the remaining tasks all prefer the
> > > +		 * current LLC, because after order_tasks_by_llc(), the
> > > +		 * tasks that prefer the current LLC are at the tail of
> > > +		 * the list. The inhibition of detachment is to avoid too
> > > +		 * many tasks being migrated out of the preferred LLC.
> > > +		 */
> > > +		if (sched_cache_enabled() && detached && p->preferred_llc != -1 &&
> > > +		    llc_id(env->src_cpu) == p->preferred_llc)
> > > +			break;
> > 
> > In all cases? Should we check can_migrate_llc() w.r.t. the util migrated
> > and then make a call if we should move the preferred LLC tasks or not?
> > 
> 
> Prior to this "stop of detaching tasks", we performed a can_migrate_task(p)
> to determine if the detached p is dequeued from its preferred LLC, and in
> can_migrate_task(), we use can_migrate_llc_task() -> can_migrate_llc() to
> carry out the check. That is to say, only when certain tasks have been
> detached, will we stop further detaching.
> 
> > Perhaps disallow it the first time if "nr_balance_failed" is 0 but
> > subsequent failed attempts should perhaps explore breaking the preferred
> > llc restriction if there is an imbalance and we are under
> > "mig_unrestricted" conditions.
> > 
> 

Prateek,

We have to actually allow for imbalance between LLCs with task
aggregation.

Say we have 2 LLCs and only one process running. Suppose all tasks in the process
can fit in one LLC without overloading it. Then we should not pull tasks from
the preferred LLC, and we should allow the imbalance. If we balance the tasks the
second time around, that will defeat the purpose.

That's why we have the knob llc_overload_pct (50%), which will start spreading
tasks to the non-preferred LLC once load in the preferred LLC exceeds 50%.
And llc_imb_pct (20%), which allows the preferred LLC to carry a 20% higher load
than the non-preferred LLC if the preferred LLC is operating above 50%.
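
The interplay of the two knobs can be sketched in a small userspace model.
This is only an illustration under stated assumptions, not the code from the
patch: may_pull_from_preferred() is a hypothetical helper, utilization and
capacity are plain integers, and the 20% allowance is assumed to be relative
to the non-preferred LLC's utilization.

```c
#include <stdbool.h>

/* Percentages quoted in this message; the macro names are illustrative. */
#define LLC_OVERLOAD_PCT 50
#define LLC_IMB_PCT      20

/*
 * Hypothetical helper (not part of the patch): may load balancing pull a
 * task out of its preferred LLC?
 *
 * pref_util    - utilization of the preferred LLC
 * other_util   - utilization of the non-preferred LLC
 * llc_capacity - capacity of one LLC, in the same units as the utils
 */
static bool may_pull_from_preferred(unsigned long pref_util,
				    unsigned long other_util,
				    unsigned long llc_capacity)
{
	/* At or below the overload threshold: keep tasks aggregated. */
	if (pref_util * 100 <= llc_capacity * LLC_OVERLOAD_PCT)
		return false;

	/*
	 * Above the threshold: still tolerate the preferred LLC being up
	 * to LLC_IMB_PCT busier than the other LLC before spreading.
	 */
	return pref_util * 100 > other_util * (100 + LLC_IMB_PCT);
}
```

With a capacity of 1024, the model keeps tasks in place for utilizations
(400, 0) and (600, 550), and permits a pull for (600, 400).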

So if we ignore the LLC policy totally the second time around, we may break
LLC aggregation and have tasks moved to their non-preferred LLC.

I will take a closer look to see whether nr_balance_failed > 0 arises
because we repeatedly cannot move tasks to their preferred LLC, and whether
we should do anything different to balance tasks better without violating
LLC preference.

Tim

> I suppose you are suggesting that the threshold for stopping task
> detachment should be higher. With the above can_migrate_llc() check,
> I suppose we have raised the threshold for stopping "task detachment"?
> 
> thanks,
> Chenyu
> 
> > > +#endif
> > > +
> > >   		continue;
> > >   next:
> > >   		if (p->sched_task_hot)
> > 
