lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SL2PR06MB3082EA14946F2E207B3043E2BDFA9@SL2PR06MB3082.apcprd06.prod.outlook.com>
Date:   Wed, 27 Apr 2022 02:18:37 +0000
From:   王擎 <wangqing@...o.com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>,
        Sudeep Holla <sudeep.holla@....com>
CC:     Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "peterz@...radead.org" <peterz@...radead.org>
Subject: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI


>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>
>>>>>>>>> From: Wang Qing <wangqing@...o.com>
>>>>>>>>>
>>>>>>>>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>>>>>>>>> which is set by cluster_id, cluster_id can only get from ACPI.
>>>>>>>>>
>>>>>>>>> If the system does not enable ACPI, cluster_id is always -1, even enable
>>>>>>>>> SCHED_CLUSTER is invalid, this is misleading. 
>>>>>>>>>
>>>>>>>>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Any reason why this can't be extended to support DT based systems via
>>>>>>>> cpu-map in the device tree. IMO we almost have everything w.r.t topology
>>>>>>>> in DT and no reason to deviate this feature between ACPI and DT.
>>>>>>>>
>>>>>>> That's the problem, we parse out "cluster" info according to the
>>>>>>> description in cpu-map, but do assign it to package_id, which used to
>>>>>>> configure the MC sched domain, not cluster sched domain.
>>>>>>>
>>>>>>
>>>>>> Right, we haven't updated the code after updating the bindings to match
>>>>>> ACPI sockets which are the physical package boundaries. Clusters are not
>>>>>> the physical boundaries and the current topology code is not 100% aligned
>>>>>> with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>>>>>> support for sockets defining package boundaries")
>>>>>
>>>>> I see, but this commit is a long time ago, why hasn't it been used widely.
>>>>> Maybe I can help about it if you need.
>>>>>
>>>>
>>>> I assume no one cared or had a requirement for the same.
>>>
>>> It took me a while to find the root cause why enabling SCHED_CLUSTER
>>> didn't work.
>>>
>>> We should add SCHED_CLUSTER's dependency before implementation.
>>> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
>>> will have this problem.
>>>
>> 
>> I am fine with that or mark it broken for DT, but ideally I wouldn't
>> want to create unnecessary dependency on ACPI or DT when both supports
>> the feature.
>
>IMHO trying to introduce SCHED_COMPLEX for DT next to the linux-wide
>available SCHED_CLUSTER (used only for ACPI right now) is the wrong way.
>
>_If_ asymmetric sub-clustering of CPUs underneath LLC (L3) makes any
>sense on ARMv9 single DSU systems like:
>
>      .---------------.
>CPU   |0 1 2 3 4 5 6 7|
>      +---------------+
>uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>      +---------------+
>  L2  |   |   | | | | |
>      +---------------+
>  L3  |<--         -->|
>      +---------------+
>      |<-- cluster -->|
>      +---------------+
>      |<--   DSU   -->|
>      '---------------'
>
>(and I'm not saying here it does!) then the existing SCHED_CLUSTER
>should be used in DT as well. Since it provides exactly the same
>functionality for the task scheduler no matter whether it's setup from
>ACPI or DT.
>
>parse_cluster() -> parse_core() should be changed to be able to decode
>both id's (package_id and cluster_id) in this case.

Totally agree, but not implemented yet. Because now cluster_id is used
to describe the package/socket, the modification will involve all DTS.

>
>DT's cpu_map would have to be changed to code 2 level setups.
>
>For a system like the one above it should look like:
>
>                cpu-map {
>                        cluster0 {
>                                foo0 {
>                                        core0 {
>                                                cpu = <&cpu0>;
>                                        };
>                                        core1 {
>                                                cpu = <&cpu1>;
>                                        };
>                                };
>                                foo2 {
>                                        core2 {
>                                                cpu = <&cpu2>;
>                                        };
>                                        core3 {
>                                                cpu = <&cpu3>;
>                                        };
>                                };
>                        };
>                        cluster1 {
>                                core0 {
>                                        cpu = <&cpu4>;
>                                };
>                                core1 {
>                                        cpu = <&cpu5>;
>                                };
>                                core2 {
>                                        cpu = <&cpu6>;
>                                };
>                        };
>                        cluster2 {
>                                core0 {
>                                        cpu = <&cpu7>;
>                                };
>                        };
>                };
>
>I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>L2-complexes on the LITTLES and I get:

This system is exactly what I mentioned, but I have a question,
How did you parse out the cluster_id based on foo0/foo2?
Because if ACPI is not used, cluster_id is always -1.

What I want to do is to change the foo0/foo2 to complex0/complex2 here,
then parse it like parse_cluster() -> parse_complex() -> parse_core().

>
># cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>DIE
>
>cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]
>
>cpu0 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu1 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu2 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu3 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu4 0
>domain0 70
>domain1 ff
>cpu5 0
>domain0 70
>domain1 ff
>cpu6 0
>domain0 70
>domain1 ff
>cpu7 0
>domain0 ff
>
>Like I mentioned earlier, I'm not sure if this additional complexity
>makes sense on mobile systems running EAS (since only CFS load-balancing
>on little CPUs would be affected).
>
>But my hunch is that this setup is what you want for your system. If we
>could agree on this one, that would already be some progress to see the
>entire story here.

Yes, that's what I want, but still a little confused, why we use MC to
describe "cluster" and use CLS describe "complex", can you show some details?

Thanks,
Qing

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ