Message-ID: <e3861092-71d3-4f36-8013-d721f60c1392@arm.com>
Date: Mon, 18 Aug 2025 16:24:20 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Xuewen Yan <xuewen.yan94@...il.com>
Cc: Christian Loehle <christian.loehle@....com>,
Xuewen Yan <xuewen.yan@...soc.com>, mingo@...hat.com, peterz@...radead.org,
juri.lelli@...hat.com, vincent.guittot@...aro.org, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
vdonnefort@...gle.com, ke.wang@...soc.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] sched/feec: Simplify the traversal of pd'cpus
On 18.08.25 12:05, Xuewen Yan wrote:
> On Fri, Aug 15, 2025 at 9:01 PM Dietmar Eggemann
> <dietmar.eggemann@....com> wrote:
>>
>> On 14.08.25 10:52, Xuewen Yan wrote:
>>> Hi Dietmar,
>>>
>>> On Thu, Aug 14, 2025 at 4:46 PM Dietmar Eggemann
>>> <dietmar.eggemann@....com> wrote:
>>>>
>>>> On 12.08.25 10:33, Xuewen Yan wrote:
>>
>> [...]
>>
>>>> Can you not mask cpus already early in the pd loop (1) and then profit
>>>> from (2) in these rare cases?
>>>
>>> I do not think the cpus_ptr mask could be applied before the pd_cap calc,
>>> because the following scenario should be considered:
>>> the task's cpus_ptr cpus: 0,1,2,3
>>> pd's cpus: 0,1,2,3,4,5,6
>>> the pd's cap = cpu_cap * 6;
>>> if we cpumask_and(pd's cpus, p->cpus_ptr),
>>> the cpumask_weight = 4,
>>> the pd's cap = cpu_cap * 4.
>>
>> Yes, you're right! Missed this one.
>>
>>>> IIRC, the sd only plays a role here in
>>>> exclusive cpusets scenarios which I don't think anybody deploys with EAS?
>>>
>>> I am also wondering if the check for SD's CPUs could be removed...
>>
>> Still not 100% sure here. I would have to play with cpusets and EAS a
>> little bit more. Are you thinking that in those cases p->cpus_ptr
>> already covers the cpuset restriction so that the sd mask isn't necessary?
>
> I am not familiar with cpuset, so I can't guarantee this. Similarly, I
> also need to learn more about cpuset and cpu topology before I can
> answer this question.
Looks like we do also need the sd cpumask here.
Consider this tri-gear system:
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
160
160
160
160
498
498
1024
1024
and 2 exclusive cpusets cs1={0-1,4,6} and cs2={2-3,5,7}, so EAS is
possible in all 3 root_domains (/, /cs1, /cs2):
...
[ 74.691104] CPU1 attaching sched-domain(s):
[ 74.691180] domain-0: span=0-1 level=MC
[ 74.691244] groups: 1:{ span=1 cap=159 }, 0:{ span=0 cap=155 }
[ 74.693453] domain-1: span=0-1,4,6 level=PKG
[ 74.693534] groups: 0:{ span=0-1 cap=314 }, 4:{ span=4 cap=496 }, 6:{ span=6 cap=986 }
...
[ 74.697890] root domain span: 0-1,4,6
[ 74.697994] root_domain 2-3,5,7: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{ cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
[ 74.698922] root_domain 0-1,4,6: pd6:{ cpus=6-7 nr_pstate=4 } pd4:{ cpus=4-5 nr_pstate=4 } pd0:{ cpus=0-3 nr_pstate=4 }
sd = rcu_dereference(*this_cpu_ptr(&sd_asym_cpucapacity));
Tasks running in '/' keep cpus_ptr=0-7, so only the sd span reduces their
CPU affinity to the CPUs of the current root domain correctly.
...
[001] 5290.935663: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0
[001] 5290.935696: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=6-7 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935753: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=4-5 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
[001] 5290.935779: select_task_rq_fair: kworker/u33:3 382 prev_cpu=0 pd=0-3 online=0-7 sd=0-1,4,6 cpus_ptr=0-7
...