linux-kernel - Re: [PATCH 15/19] sched/fair: Respect LLC preference in task migration and detach

Message-ID: <2c57d76f-fb31-4e1b-a3ce-ca13713e1b86@amd.com>
Date: Wed, 29 Oct 2025 09:24:02 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: "Chen, Yu C" <yu.c.chen@...el.com>
CC: Vincent Guittot <vincent.guittot@...aro.org>, Juri Lelli
	<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, "Steven
 Rostedt" <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Madadi Vineeth
 Reddy" <vineethr@...ux.ibm.com>, Hillf Danton <hdanton@...a.com>, "Shrikanth
 Hegde" <sshegde@...ux.ibm.com>, Jianyong Wu <jianyong.wu@...look.com>, Yangyu
 Chen <cyy@...self.name>, Tingyin Duan <tingyin.duan@...il.com>, "Vern Hao"
	<vernhao@...cent.com>, Len Brown <len.brown@...el.com>, Aubrey Li
	<aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>, Chen Yu
	<yu.chen.surf@...il.com>, Adam Li <adamli@...amperecomputing.com>, Tim Chen
	<tim.c.chen@...el.com>, <linux-kernel@...r.kernel.org>, Tim Chen
	<tim.c.chen@...ux.intel.com>, Peter Zijlstra <peterz@...radead.org>, "Gautham
 R . Shenoy" <gautham.shenoy@....com>, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH 15/19] sched/fair: Respect LLC preference in task
 migration and detach

Hello Chenyu,

On 10/28/2025 5:28 PM, Chen, Yu C wrote:
> Hi Prateek,
> 
> On 10/28/2025 2:02 PM, K Prateek Nayak wrote:
>> Hello Tim,
>>
>> On 10/11/2025 11:54 PM, Tim Chen wrote:
>>> @@ -9969,6 +9969,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>       if (env->flags & LBF_ACTIVE_LB)
>>>           return 1;
>>>   +#ifdef CONFIG_SCHED_CACHE
>>> +    if (sched_cache_enabled() &&
>>> +        can_migrate_llc_task(env->src_cpu, env->dst_cpu, p) == mig_forbid)
>>> +        return 0;
>>> +#endif
>>> +
>>>       degrades = migrate_degrades_locality(p, env);
>>>       if (!degrades)
>>>           hot = task_hot(p, env);
>>
>> Should we care for task_hot() w.r.t. migration cost if a task is being
>> moved to a preferred LLC?
>>
> 
> This is a good question. The decision not to migrate a task when its
> LLC preference is violated takes priority over the check in task_hot().
> 
> The main reason is that we want cache aware aggregation to be more
> aggressive than generic migration; otherwise, cache-aware migration
> might not take effect according to our previous test. This seems to
> be a trade-off. Another consideration might be: should we consider
> the occupancy of a single thread or that of the entire process?
> For example, suppose t0, t1, and t2 belong to the same process. t0
> and t1 are running on the process's preferred LLC0, while t2 is
> running on the non-preferred LLC1. Even though t2 has high occupancy
> on LLC1 (making it cache-hot on LLC1), we might still want to move t2
> to LLC0 if t0, t1, and t2 read from and write to each other - since we
> don't want to generate cross-LLC access.

Makes sense. That would need some heuristics based on the avg_running
to know which LLC can be a potential target with the fewest migrations.
But then again, in a dynamic system things change so quickly - what
you have now seems to be a good start to further optimize on top of.

> 
>> Also, should we leave out tasks under core scheduling from the llc
>> aware lb? Even discount them when calculating "mm->nr_running_avg"?
>>
> Yes, it seems that the cookie match check case was missed, which is
> embedded in task_hot(). I suppose you are referring to the p->core_cookie
> check; I'll look into this direction.

Yup! I think if the user has opted into core scheduling, they should
ideally not have to bother about cache aware scheduling.
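
To make the idea concrete, here is a rough userspace model, not kernel
code. The struct and helper names (cs_task, cs_task_is_llc_aware(),
cs_count_llc_aware()) are made up for illustration; only the notion of a
non-zero core cookie and a preferred LLC of -1 meaning "no preference"
is taken from this thread. It assumes a tagged task is simply ignored by
the LLC-aware path and left out of any per-mm running count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fields discussed above. */
struct cs_task {
	unsigned long core_cookie;	/* non-zero: tagged for core scheduling */
	int preferred_llc;		/* -1: no LLC preference */
};

/* A task takes part in LLC-aware balancing only if it is untagged. */
static bool cs_task_is_llc_aware(const struct cs_task *p)
{
	return p->core_cookie == 0 && p->preferred_llc != -1;
}

/* Count only the tasks that would feed a per-mm running average. */
static size_t cs_count_llc_aware(const struct cs_task *tasks, size_t nr)
{
	size_t n = 0;

	for (size_t i = 0; i < nr; i++)
		if (cs_task_is_llc_aware(&tasks[i]))
			n++;

	return n;
}
```

Whether the discount belongs at enqueue time or in the averaging itself
is a separate question that this model does not try to answer.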

> 
>>> @@ -10227,6 +10233,20 @@ static int detach_tasks(struct lb_env *env)
>>>           if (env->imbalance <= 0)
>>>               break;
>>>   +#ifdef CONFIG_SCHED_CACHE
>>> +        /*
>>> +         * Don't detach more tasks if the remaining tasks want
>>> +         * to stay. We know the remaining tasks all prefer the
>>> +         * current LLC, because after order_tasks_by_llc(), the
>>> +         * tasks that prefer the current LLC are at the tail of
>>> +         * the list. The inhibition of detachment is to avoid too
>>> +         * many tasks being migrated out of the preferred LLC.
>>> +         */
>>> +        if (sched_cache_enabled() && detached && p->preferred_llc != -1 &&
>>> +            llc_id(env->src_cpu) == p->preferred_llc)
>>> +            break;
>>
>> In all cases? Should we check can_migrate_llc() w.r.t. the util migrated
>> and then make a call if we should move the preferred LLC tasks or not?
>>
> 
> Prior to this "stop of detaching tasks", we performed a can_migrate_task(p)
> to determine if the detached p is dequeued from its preferred LLC, and in
> can_migrate_task(), we use can_migrate_llc_task() -> can_migrate_llc() to
> carry out the check. That is to say, only when certain tasks have been
> detached, will we stop further detaching.
> 
>> Perhaps disallow it the first time if "nr_balance_failed" is 0 but
>> subsequent failed attempts should perhaps explore breaking the preferred
>> llc restriction if there is an imbalance and we are under
>> "mig_unrestricted" conditions.
>>
> 
> I suppose you are suggesting that the threshold for stopping task detachment
> should be higher. With the above can_migrate_llc() check, I suppose we have
> raised the threshold for stopping "task detachment"?

Say the LLC is under heavy load and we only have overloaded groups.
can_migrate_llc() would return "mig_unrestricted" since
fits_llc_capacity() would return false.

Since we are under "migrate_load", sched_balance_find_src_rq() has
returned the CPU with the highest load which could very well be the
CPU with a large number of preferred LLC tasks.

sched_cache_enabled() is still true and when detach_tasks() reaches
one of these preferred llc tasks (which come at the very end of the
tasks list), we break out even if env->imbalance > 0, leaving a
potential imbalance for the "migrate_load" case.

Instead, we can account for the util moved out of the src_llc and
after accounting for it, check if can_migrate_llc() would return
"mig_forbid" for the src llc.

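A rough userspace model of what I mean, not kernel code. Everything
prefixed with model_ is made up for illustration, and the capacity check
is an assumed "util is at most pct% of capacity" stand-in rather than the
real fits_llc_capacity(). The real can_migrate_llc() also looks at the
destination LLC, which this model ignores. With account_moved_util set
to false the loop behaves like the current hunk and stops at the first
preferred-LLC task once something has been detached.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_mig { MODEL_MIG_FORBID, MODEL_MIG_UNRESTRICTED };

struct model_task {
	unsigned long util;
	bool prefers_src_llc;	/* preferred tasks sit at the tail of the list */
};

/* Assumed check: forbid moving tasks out while the source LLC still fits. */
static enum model_mig model_can_migrate_llc(unsigned long src_util,
					    unsigned long capacity,
					    unsigned int pct)
{
	bool fits = src_util * 100 <= capacity * pct;

	return fits ? MODEL_MIG_FORBID : MODEL_MIG_UNRESTRICTED;
}

/* Returns the number of tasks detached from the source runqueue. */
static size_t model_detach_tasks(const struct model_task *tasks, size_t nr,
				 unsigned long src_llc_util,
				 unsigned long capacity, unsigned int pct,
				 long imbalance, bool account_moved_util)
{
	unsigned long moved = 0;
	size_t detached = 0;

	for (size_t i = 0; i < nr; i++) {
		if (imbalance <= 0)
			break;

		if (detached && tasks[i].prefers_src_llc) {
			unsigned long remaining;

			/* Current behaviour: stop unconditionally. */
			if (!account_moved_util)
				break;

			/* Proposed: re-evaluate with the moved util removed. */
			remaining = src_llc_util > moved ? src_llc_util - moved : 0;
			if (model_can_migrate_llc(remaining, capacity, pct) ==
			    MODEL_MIG_FORBID)
				break;
		}

		moved += tasks[i].util;
		imbalance -= (long)tasks[i].util;
		detached++;
	}

	return detached;
}
```

In an overloaded source LLC the loop keeps detaching preferred tasks
until either the imbalance is gone or the LLC fits again, and in a
lightly loaded one it still stops at the first preferred task.
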
-- 
Thanks and Regards,
Prateek