linux-kernel - Re: [PATCH v2 3/8] arch_topology: Set cluster identifier in each core/thread from /cpu-map


Message-ID: <26f39a9d-1a02-b77d-5c89-88a1fb0e4eac@arm.com>
Date:   Fri, 20 May 2022 14:33:19 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Ionela Voinescu <ionela.voinescu@....com>,
        Sudeep Holla <sudeep.holla@....com>
Cc:     Atish Patra <atishp@...shpatra.org>, linux-kernel@...r.kernel.org,
        Atish Patra <atishp@...osinc.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Qing Wang <wangqing@...o.com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-riscv@...ts.infradead.org, Rob Herring <robh+dt@...nel.org>
Subject: Re: [PATCH v2 3/8] arch_topology: Set cluster identifier in each
 core/thread from /cpu-map

On 19/05/2022 18:55, Ionela Voinescu wrote:
> Hi,
> 
> As said before, this creates trouble for CONFIG_SCHED_CLUSTER=y.
> The output below is obtained from Juno.
> 
> When cluster_id is populated, a new CLS level is created by the scheduler
> topology code. In this case the clusters in DT determine that the cluster
> siblings and llc siblings are the same so the MC scheduler domain will
> be removed and, for Juno, only CLS and DIE will be kept.

[...]

> To be noted that we also get a new flag SD_PREFER_SIBLING for the CLS
> level that is not appropriate. We usually remove it for the child of a
> SD_ASYM_CPUCAPACITY domain, but we don't currently redo this after some
> levels are degenerated. This is a fixable issue.
> 
> But looking at the bigger picture, a good question is what is the best
> thing to do when cluster domains and llc domains span the same CPUs?
> 
> Possibly it would be best to restrict clusters (which are almost an
> arbitrary concept) to always span a subset of CPUs of the llc domain,
> if llc siblings can be obtained? If those clusters are not properly set
> up in DT to respect this condition, cluster_siblings would need to be
> cleared (or set to the current CPU) so the CLS domain is not created at
> all.
> 
> Additionally, should we use cluster information from DT (cluster_id) to
> create an MC level if we don't have llc information, even if
> CONFIG_SCHED_CLUSTER=n?
> 
> I currently don't have a very clear picture of how cluster domains and
> llc domains would "live" together in a variety of topologies. I'll try
> other DT topologies to see if there are others that can lead to trouble.

This would be an issue. Depending on CONFIG_SCHED_CLUSTER we would get
two different systems from the viewpoint of the scheduler.

To me `cluster_id/_sibling` don't describe a certain level of CPU
grouping (e.g. one level above core or one level below package).

They were introduced to describe one level below LLC (e.g. Kunpeng920 L3
(24 CPUs LLC) -> L3 tag (4 CPUs) or x86 Jacobsville L3 -> L2), see commit
                 ^^^^^^                                   ^^
c5e22feffdd7 ("topology: Represent clusters of CPUs within a die").

The Ampere Altra issue already gave us a taste of the possible issues of
this definition, commit db1e59483dfd ("topology: make core_mask include
at least cluster_siblings").

If we link `cluster_id/_sibling` to the first-level cpu-map cluster
nodes while also using llc, and end up with
`cluster_sibling >= llc_sibling`, we will run into these issues.
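
To make the `cluster_sibling >= llc_sibling` case concrete, here is a
minimal userspace sketch of the fallback suggested in the quoted text
(clear cluster_siblings down to the current CPU when the cluster is not
smaller than the llc). This is not kernel code: `mask_t`,
`strict_subset()` and `effective_cluster_mask()` are hypothetical names,
a plain 64-bit word stands in for `struct cpumask`, and reading "subset"
as "strict subset" is my assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit N set means CPU N. */
typedef uint64_t mask_t;

/* Return 1 when every CPU in a is also in b and a is smaller than b. */
static int strict_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0 && a != b;
}

/*
 * Hypothetical helper: pick the cluster mask to expose for @cpu.
 *
 * llc == 0 means "no llc information available"; the cluster mask is
 * then left untouched. Otherwise the cluster mask is kept only if it
 * spans strictly fewer CPUs than the llc mask. If it does not, it is
 * reduced to the CPU itself, so no separate CLS level would result.
 */
static mask_t effective_cluster_mask(int cpu, mask_t cluster, mask_t llc)
{
	assert(cpu >= 0 && cpu < 64);

	if (llc != 0 && !strict_subset(cluster, llc))
		return (mask_t)1 << cpu;

	return cluster;
}
```

With a 4-CPU cluster inside a 24-CPU llc (the Kunpeng920 shape above),
`effective_cluster_mask(2, 0x0F, 0xFFFFFF)` keeps `0x0F`. When cluster
and llc span the same CPUs, `effective_cluster_mask(0, 0x0F, 0x0F)`
collapses to `0x01`, i.e. only the CPU itself.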