Message-ID: <942b4d68-8d19-66d8-c84b-d17eba837e9a@inria.fr>
Date:   Mon, 19 Oct 2020 12:00:15 +0200
From:   Brice Goglin <Brice.Goglin@...ia.fr>
To:     Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Len Brown <len.brown@...el.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Sudeep Holla <sudeep.holla@....com>, guohanjun@...wei.com,
        Will Deacon <will@...nel.org>, linuxarm@...wei.com
Subject: Re: [RFC PATCH] topology: Represent clusters of CPUs within a die.

On 16/10/2020 at 17:27, Jonathan Cameron wrote:
> Both ACPI and DT provide the ability to describe additional layers of
> topology between that of individual cores and higher level constructs
> such as the level at which the last level cache is shared.
> In ACPI this can be represented in PPTT as a Processor Hierarchy
> Node Structure [1] that is the parent of the CPU cores and in turn
> has a parent Processor Hierarchy Nodes Structure representing
> a higher level of topology.
>
> For example Kunpeng 920 has clusters of 4 CPUs.  These do not share
> any cache resources, but the interconnect topology is such that
> the cost to transfer ownership of a cacheline between CPUs within
> a cluster is lower than between CPUs in different clusters on the same
> die.   Hence, it can make sense to deliberately schedule threads
> sharing data to a single cluster.
>
> This patch simply exposes this information to userspace libraries
> like hwloc by providing cluster_cpus and related sysfs attributes.
> PoC of HWLOC support at [2].
>
> Note this patch only handles the ACPI case.
>
> Special consideration is needed for SMT processors, where it is
> necessary to move 2 levels up the hierarchy from the leaf nodes
> (thus skipping the processor core level).
>
> Currently the ID provided is the offset of the Processor
> Hierarchy Nodes Structure within PPTT.  Whilst this is unique
> it is not terribly elegant so alternative suggestions welcome.
>
> Note that arm64 / ACPI does not provide any means of identifying
> a die level in the topology but that may be unrelated to the cluster
> level.
>
> RFC questions:
> 1) Naming
> 2) Related to naming, do we want to represent all potential levels,
>    or is this enough?  On Kunpeng920, the next level up from cluster happens
>    to be covered by llc cache sharing, but in theory more than one
>    level of cluster description might be needed by some future system.
> 3) Do we need DT code in place? I'm not sure any DT based ARM64
>    systems would have enough complexity for this to be useful.
> 4) Other architectures?  Is this useful on x86 for example?


Hello Jonathan

Intel's CPUID topology enumeration can describe "tiles" and "modules" too (not
used yet as far as I know). The list of levels could become quite long
if any processor ever exposes those. If having multiple cluster levels
is possible, maybe it's time to think about introducing some sort of
generic levels:

cluster0_id = your cluster_id
cluster0_cpus/cpulist = your cluster_cpus/cpulist
cluster0_type = would optionally contain hardware-specific info such as
"module" or "tile" on x86
cluster_levels = 1
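
For illustration only, a userspace consumer of such a layout could look
roughly like the Python sketch below. The attribute names (cluster_levels,
clusterN_id, clusterN_cpulist, clusterN_type) are just the ones suggested
above, not an existing kernel ABI, and the helper names are hypothetical.

```python
import os


def parse_cpulist(text):
    """Expand a cpulist string such as "0-3,8" into a sorted list of CPUs.

    Only plain numbers and "a-b" ranges are handled, which is what sysfs
    cpulist files normally contain.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_attr(topology_dir, name):
    """Return the stripped content of one attribute file, or None if absent."""
    try:
        with open(os.path.join(topology_dir, name)) as f:
            return f.read().strip()
    except OSError:
        return None


def read_cluster_levels(topology_dir):
    """Read the hypothetical generic cluster levels of one CPU.

    ASSUMPTION: topology_dir contains cluster_levels plus clusterN_id,
    clusterN_cpulist and (optionally) clusterN_type for each level N.
    """
    count = int(read_attr(topology_dir, "cluster_levels") or 0)
    levels = []
    for n in range(count):
        levels.append({
            "level": n,
            "id": read_attr(topology_dir, f"cluster{n}_id"),
            "cpus": parse_cpulist(read_attr(topology_dir, f"cluster{n}_cpulist") or ""),
            "type": read_attr(topology_dir, f"cluster{n}_type"),  # may be None
        })
    return levels
```

On a kernel exposing these files, one would presumably call
read_cluster_levels("/sys/devices/system/cpu/cpu0/topology"). If
cluster_levels is missing, the function returns an empty list, so old and
new kernels could be handled by the same code path.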

hwloc already does something like this for some "rare" levels such as
s390 books/drawers (by the way, thanks a lot for the hwloc PoC, very good
job). We call them "Groups" rather than "clusters" as above.

However I don't know if the Linux scheduler would like that. Is it
better to have 10+ levels with static names, or a dynamic number of levels?

Brice
