Message-ID: <8a5eff5e-4184-958a-17d8-b551e7efc784@arm.com>
Date:   Thu, 6 Jul 2023 13:11:15 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Tobias Huschle <huschle@...ux.ibm.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        vschneid@...hat.com, sshegde@...ux.vnet.ibm.com,
        srikar@...ux.vnet.ibm.com, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [RFC 0/1] sched/fair: Consider asymmetric scheduler groups in
 load balancer

On 04/07/2023 11:11, Tobias Huschle wrote:
> On 2023-05-16 18:35, Dietmar Eggemann wrote:
>> On 15/05/2023 13:46, Tobias Huschle wrote:
>>> The current load balancer implementation assumes that scheduler groups
>>> within the same scheduler domain all host the same number of CPUs.
>>>
>>> This appears to hold for non-s390 architectures. s390, however, can
>>> have scheduler groups of unequal size.
>>
>> Arm (classical) big.LITTLE had this for years before we switched to flat
>> scheduling (only an MC sched domain) across CPU capacity boundaries for
>> Arm DynamIQ.
>>
>> Arm64 Juno platform in mainline:
>>
>> root@...o:~# cat /sys/devices/system/cpu/cpu*/topology/cluster_cpus_list
>> 0,3-5
>> 1-2
>> 1-2
>> 0,3-5
>> 0,3-5
>> 0,3-5
>>
>> root@...o:~# cat /proc/schedstat | grep ^domain | awk '{print $1, $2}'
>>
>> domain0 39 <--
>> domain1 3f
>> domain0 06 <--
>> domain1 3f
>> domain0 06
>> domain1 3f
>> domain0 39
>> domain1 3f
>> domain0 39
>> domain1 3f
>> domain0 39
>> domain1 3f
>>
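(To decode: the second schedstat column is a hex cpumask, so domain0 39
= 0b111001 = CPUs {0,3,4,5} and domain0 06 = 0b000110 = CPUs {1,2},
i.e. the two MC groups of unequal size shown by cluster_cpus_list
above, while domain1 3f spans all six CPUs.)
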
>> root@...o:~# cat /sys/kernel/debug/sched/domains/cpu0/domain*/name
>> MC
>> DIE
>>
>> But we don't have SMT on the mobile processors.
>>
>> It looks like you are only interested in getting a group_weight
>> dependency into this 'prefer_sibling' condition of find_busiest_group()?
>>
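(For reference, the condition in question currently reads roughly like
this, paraphrased from find_busiest_group() in kernel/sched/fair.c; the
exact form varies by kernel version:

	if (sds.prefer_sibling && local->group_type == group_has_spare &&
	    busiest->sum_nr_running > local->sum_nr_running + 1)
		goto force_balance;

Note that it compares raw task counts, with no notion of how many CPUs
each group actually has.)
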
> Sorry, it looks like your reply hit a bad filter in my mail program.
> Let me answer, although it's a bit late.
> 
> Yes, I would like to get the group_weight into the prefer_sibling path.
> Unfortunately, we cannot go for a flat hierarchy, as s390 hardware
> allows CPUs to be pretty far apart (cache-wise), which means the load
> balancer should avoid moving tasks back and forth between those CPUs
> if possible.
> 
> We can't remove SD_PREFER_SIBLING either, as this would cause the load
> balancer to aim for the same number of idle CPUs in all groups, which
> is also a problem in asymmetric groups, for example:
> 
> With SD_PREFER_SIBLING, aiming for the same number of non-idle CPUs
> 00 01 02 03 04 05 06 07 08 09 10 11  || 12 13 14 15
>                 x     x     x     x      x  x  x  x
> 
> Without SD_PREFER_SIBLING, aiming for the same number of idle CPUs
> 00 01 02 03 04 05 06 07 08 09 10 11  || 12 13 14 15
>     x  x  x     x  x     x     x  x
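(Both examples place the same eight tasks, marked 'x', on a 12-CPU
group plus a 4-CPU group: with SD_PREFER_SIBLING each group runs four
tasks, leaving 8 vs. 0 idle CPUs; without it, each group keeps four
CPUs idle, so all eight tasks end up packed onto the 12-CPU group.)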
> 
> 
> Hence the idea to add the group_weight to the prefer_sibling path.
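
A sketch of one way group_weight could enter that condition, purely
illustrative and not necessarily the right final form:

	/*
	 * Illustrative only: compare task counts in proportion to
	 * group size, so a 12-CPU and a 4-CPU group count as balanced
	 * at the same utilization ratio rather than at the same
	 * absolute number of running tasks.
	 */
	if (sds.prefer_sibling && local->group_type == group_has_spare &&
	    busiest->sum_nr_running * local->group_weight >
	    local->sum_nr_running * busiest->group_weight)
		goto force_balance;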
> 
> I was wondering if this would be the right place to address this issue
> or if I should go down another route.

Yes, it's the right place to fix it for you. IMHO, if I read the
comments on this thread correctly, there is still some discussion
needed about the correct condition and about the corresponding changes
in calculate_imbalance() for your case.
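
For reference, the prefer_sibling path in calculate_imbalance() sizes
the imbalance as a plain difference in task counts, again with no
notion of group size (paraphrased from the group_has_spare case in
kernel/sched/fair.c; details vary by version):

	if (busiest->group_weight == 1 || sds->prefer_sibling) {
		unsigned int nr_diff = busiest->sum_nr_running;

		/*
		 * When prefer sibling, evenly spread running tasks on
		 * groups.
		 */
		env->migration_type = migrate_task;
		lsub_positive(&nr_diff, local->sum_nr_running);
		env->imbalance = nr_diff;
	}

Any group_weight awareness added to the force_balance condition would
probably have to be mirrored here as well.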

Arm64 big.LITTLE wouldn't be affected, since we explicitly remove
SD_PREFER_SIBLING on MC for our legacy MC,DIE setups to avoid spreading
tasks across DIE sched groups holding CPUs with different capacities.
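
For reference, the mechanism doing that lives in sd_init() in
kernel/sched/topology.c (paraphrased; details vary by version):

	if (sd->flags & SD_ASYM_CPUCAPACITY) {
		/*
		 * Don't attempt to spread across CPUs of different
		 * capacities.
		 */
		if (sd->child)
			sd->child->flags &= ~SD_PREFER_SIBLING;
	}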

[...]
