Message-ID: <91cbbdb7-53fd-483a-8bc3-cd4e63713485@intel.com>
Date: Wed, 29 Oct 2025 22:23:43 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: K Prateek Nayak <kprateek.nayak@....com>
CC: Vincent Guittot <vincent.guittot@...aro.org>, Juri Lelli
	<juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>, "Steven
 Rostedt" <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman
	<mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, "Madadi Vineeth
 Reddy" <vineethr@...ux.ibm.com>, Hillf Danton <hdanton@...a.com>, "Shrikanth
 Hegde" <sshegde@...ux.ibm.com>, Jianyong Wu <jianyong.wu@...look.com>,
	"Yangyu Chen" <cyy@...self.name>, Tingyin Duan <tingyin.duan@...il.com>, Vern
 Hao <vernhao@...cent.com>, Len Brown <len.brown@...el.com>, Aubrey Li
	<aubrey.li@...el.com>, Zhao Liu <zhao1.liu@...el.com>, Chen Yu
	<yu.chen.surf@...il.com>, Adam Li <adamli@...amperecomputing.com>, Tim Chen
	<tim.c.chen@...el.com>, <linux-kernel@...r.kernel.org>, Tim Chen
	<tim.c.chen@...ux.intel.com>, Peter Zijlstra <peterz@...radead.org>, "Gautham
 R . Shenoy" <gautham.shenoy@....com>, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH 15/19] sched/fair: Respect LLC preference in task
 migration and detach
On 10/29/2025 11:54 AM, K Prateek Nayak wrote:
[snip]
>>>> @@ -10227,6 +10233,20 @@ static int detach_tasks(struct lb_env *env)
>>>>            if (env->imbalance <= 0)
>>>>                break;
>>>>    +#ifdef CONFIG_SCHED_CACHE
>>>> +        /*
>>>> +         * Don't detach more tasks if the remaining tasks want
>>>> +         * to stay. We know the remaining tasks all prefer the
>>>> +         * current LLC, because after order_tasks_by_llc(), the
>>>> +         * tasks that prefer the current LLC are at the tail of
>>>> +         * the list. The inhibition of detachment is to avoid too
>>>> +         * many tasks being migrated out of the preferred LLC.
>>>> +         */
>>>> +        if (sched_cache_enabled() && detached && p->preferred_llc != -1 &&
>>>> +            llc_id(env->src_cpu) == p->preferred_llc)
>>>> +            break;
>>>
>>> In all cases? Should we check can_migrate_llc() wrt the util migrated and
>>> then make a call if we should move the preferred LLC tasks or not?
>>>
>>
>> Prior to this "stop of detaching tasks", we performed a can_migrate_task(p)
>> to determine if the detached p is dequeued from its preferred LLC, and in
>> can_migrate_task(), we use can_migrate_llc_task() -> can_migrate_llc() to
>> carry out the check. That is to say, only when certain tasks have been
>> detached, will we stop further detaching.
>>
>>> Perhaps disallow it the first time if "nr_balance_failed" is 0 but
>>> subsequent failed attempts should perhaps explore breaking the preferred
>>> llc restriction if there is an imbalance and we are under
>>> "mig_unrestricted" conditions.
>>>
>>
>> I suppose you are suggesting that the threshold for stopping task detachment
>> should be higher. With the above can_migrate_llc() check, I suppose we have
>> raised the threshold for stopping "task detachment"?
> 
> Say the LLC is under heavy load and we only have overloaded groups.
> can_migrate_llc() would return "mig_unrestricted" since
> fits_llc_capacity() would return false.
> 
> Since we are under "migrate_load", sched_balance_find_src_rq() has
> returned the CPU with the highest load which could very well be the
> CPU with a large number of preferred LLC tasks.
> 
> sched_cache_enabled() is still true and when detach_tasks() reaches
> one of these preferred llc tasks (which comes at the very end of the
> tasks list), we break out even if env->imbalance > 0 leaving
> potential imbalance for the "migrate_load" case.
> 
> Instead, we can account for the util moved out of the src_llc and
> after accounting for it, check if can_migrate_llc() would return
> "mig_forbid" for the src llc.
> 

I see your point. The original decision matrix intends to spread
the tasks when both LLCs are overloaded
(src is the preferred LLC, dst is a non-preferred LLC):

src \ dst    30%   40%   50%   60%
30%           N     N     N     N
40%           N     N     N     N
50%           N     N     G     G
60%           Y     N     G     G

  src : src_util
  dst : dst_util
  Y   : yes, migrate
  N   : no, do not migrate
  G   : let the generic load balancer even out the load
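
As a rough illustration only (this is not the kernel code), the matrix
above can be written as a small lookup table. The names verdict,
matrix_lookup() and verdict_char() are hypothetical and exist only for
this sketch. The real can_migrate_llc() derives its answer from
utilization and capacity checks rather than from a table, so the sketch
covers only the four utilization levels listed above and reports
"unknown" for anything else.

```c
#include <stddef.h>

/* Verdicts from the legend above; these names are illustrative only. */
enum verdict { V_NO, V_YES, V_GENERIC, V_UNKNOWN };

/* The only utilization levels (in percent) that the matrix defines. */
static const int levels[4] = { 30, 40, 50, 60 };

/* Rows are src (preferred LLC), columns are dst (non-preferred LLC). */
static const enum verdict matrix[4][4] = {
	/* dst:  30%     40%    50%        60%       */
	{ V_NO,  V_NO,  V_NO,      V_NO      },	/* src 30% */
	{ V_NO,  V_NO,  V_NO,      V_NO      },	/* src 40% */
	{ V_NO,  V_NO,  V_GENERIC, V_GENERIC },	/* src 50% */
	{ V_YES, V_NO,  V_GENERIC, V_GENERIC },	/* src 60% */
};

/* Map a percentage to its row/column index, or -1 if it is not listed. */
int level_index(int pct)
{
	for (size_t i = 0; i < sizeof(levels) / sizeof(levels[0]); i++)
		if (levels[i] == pct)
			return (int)i;
	return -1;
}

/* Look up the verdict; values outside the matrix are not guessed. */
enum verdict matrix_lookup(int src_pct, int dst_pct)
{
	int s = level_index(src_pct);
	int d = level_index(dst_pct);

	if (s < 0 || d < 0)
		return V_UNKNOWN;
	return matrix[s][d];
}

/* Letter used in the matrix for a verdict ('?' for unknown). */
char verdict_char(enum verdict v)
{
	switch (v) {
	case V_NO:	return 'N';
	case V_YES:	return 'Y';
	case V_GENERIC:	return 'G';
	default:	return '?';
	}
}
```

For example, matrix_lookup(60, 30) gives V_YES, while any pair at 50%
or above on both sides gives V_GENERIC, which is the "both LLCs are
overloaded" case discussed in this thread.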

I suppose the reason the code breaks this rule here is, as Tim
mentioned in another thread, to inhibit tasks from bouncing
between LLCs.

thanks,
Chenyu