Date:   Thu, 19 May 2022 10:21:52 +0100
From:   Sudeep Holla <sudeep.holla@....com>
To:     Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     Qing Wang <wangqing@...o.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] arch_topology: Use llc_id instead of package_id

On Wed, May 18, 2022 at 05:54:23PM +0200, Dietmar Eggemann wrote:
> On 18/05/2022 12:43, Sudeep Holla wrote:
> > On Wed, May 18, 2022 at 12:23:35PM +0200, Dietmar Eggemann wrote:
> >> On 17/05/2022 12:57, Sudeep Holla wrote:
> > [...]
> 
> [...]
> 
> >> I see. Your requirement is valid information under
> >> /sys/devices/system/cpu/cpuX/{topology, cache} for userspace.
> >>
> >> I'm not sure that we can get core_siblings_list and package_cpus_list
> >> set up correctly.
> >>
> >> With your patch we have now:
> >>
> >> root@...o:/sys/devices/system/cpu/cpu0/topology# cat core_siblings_list
> >> 0-5
> >> root@...o:/sys/devices/system/cpu/cpu0/topology# cat package_cpus_list
> >> 0-5
> >>
> >> Before we had [0,3-5] for both.
> >>
> > 
> > Correct, and that was plainly wrong for a single-socket system: the
> > package mask must cover all the CPUs. Tools like lscpu report such a
> > system as having 2 sockets, which is wrong; with the CPUs split across
> > two package masks, lscpu counts two packages on this single-socket
> > Juno. There were reports about this, but no one really chased them up
> > to get it fixed, so it was assumed not to be important; it is still
> > wrong, though. ACPI platforms cared and got it fixed with ACPI PPTT
> > when the PPTT support was added. So check the difference on Juno
> > between DT and ACPI boot without my patches; we should be getting the
> > same output for both.
> > 
> >>
> >> I'm afraid we can have 2 different masks here because of:
> 
> Sorry, s/can/can't
>

No worries, I assumed as much and read it that way.

> >> 72 define_siblings_read_func(core_siblings, core_cpumask);
> >>                                             ^^^^^^^^^^^^
> >> 88 define_siblings_read_func(package_cpus, core_cpumask);
> >>                                            ^^^^^^^^^^^^
> >>
> > 
> > Yes, even I got confused; the documentation reveals that one of the
> > names is deprecated (see Documentation/ABI/stable/sysfs-devices-system-cpu):
> > 
> >  74 What:           /sys/devices/system/cpu/cpuX/topology/package_cpus
> >  75 Description:    internal kernel map of the CPUs sharing the same physical_package_id.
> >  76                 (deprecated name: "core_siblings").
> >  77 Values:         hexadecimal bitmask.
> >  78
> >  79 What:           /sys/devices/system/cpu/cpuX/topology/package_cpus_list
> >  80 Description:    human-readable list of CPUs sharing the same physical_package_id.
> >  81                 The format is like 0-3, 8-11, 14,17.
> >  82                 (deprecated name: "core_siblings_list")
> >  83 Values:         decimal list.
> 
> Ah, that makes it clearer to me, thanks! So the DIE boundary, i.e.
> cpu_cpu_mask() { return cpumask_of_node(cpu_to_node(cpu)); }, is not
> related to the package cpumask at all. This is why we have the first if
> condition in cpu_coregroup_mask():
> 
> 658 const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
> ...
> 661 if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
> 662  /* not numa in package, lets use the package siblings */
> 663   core_mask = &cpu_topology[cpu].core_sibling;
>

Correct. Either I got confused by the names, or this is different from
what I saw the last time I looked at these, more than a couple of years
ago.
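
For completeness, a sketch (from memory, so treat it as approximate
rather than a verbatim quote of the source) of the step that follows in
cpu_coregroup_mask() in drivers/base/arch_topology.c, which narrows the
mask further to the LLC siblings when a valid llc_id is available:

	if (cpu_topology[cpu].llc_id != -1) {
		/* prefer the LLC sharing set when it is the smaller one */
		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
			core_mask = &cpu_topology[cpu].llc_sibling;
	}

That llc_sibling narrowing is the part of the topology code this
discussion is concerned with.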

> [...]
> 
> >> OK, so we no longer support Phantom domains via the first-level
> >> cluster in cpu-map. We already explicitly informed the Android
> >> community, but I'm sure people will only discover this once something
> >> breaks on their platforms and they manage to detect it.
> >>
> > 
> > Again, if it was only working by chance with their own phantom domains
> > that are not upstream, then nothing is broken. At least I see that we
> > have now fixed DT and aligned it with ACPI. While I understand your
> > concern, I see this as the main issue, not the sched domains, which may
> > or may not be using the topology info incorrectly.
> 
> OK, let's see how you incorporate llc into the topology code in your v2
> first.
>

Sure, thanks for implicitly agreeing to review those 😄.

> >>>> If we say we only support L2 sharing (asymmetric here, since only
> >>>> CPU0-3 have it!!!) and we don't support Phantom, then your approach
> >>>> will work for such systems.
> >>>
> >>> Thanks, but as I said, what is a *Phantom* domain ;)? Can you point me
> >>> to the DT or ACPI binding for it? Just kidding, I know they don't exist.
> >>
> >> They do ;-) first-level clusters ... but they are used for uArch
> >> boundaries, not for LLC boundaries. That's the existing issue in DT:
> >> topology information has 2 sources, (1) cpu-map and (2) cache info.
> >>
> > 
> > Sure, they can fix that or add more optimisations on top of what I have
> > sent [1].
> 
> If you add llc parsing in your v2, they should have everything they need
> for Armv9 with uArch boundaries and L2 complexes. What I'm interested in
> is seeing how we solve the 2 sources (cache and cpu-map) issue here.

I am not sure that llc alone will resolve what we need on those Armv9
systems unless L2 = LLC. If L2 != LLC, then my patches won't help that
much.

> Example:
> 
> cpu-map:
> 
>  socket
>   cluster <-- (1)
>    core
>     thread
> 
> cache:
> 
>  llc
> 
> How do people know that (1) is L2 and not LLC?
>

We can get the cpumask of the CPUs sharing each unique cache in the
system. That is not a problem; we have the cacheinfo driver for that.
But it is initialised quite late for a reason, and if we need that info
much earlier, it will need some reworking.
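
For illustration, a minimal kernel-side sketch (assuming cacheinfo has
already been populated, i.e. after the late initialisation mentioned
above; the helper name is made up) of how that per-cache cpumask can be
obtained through the cacheinfo API:

	#include <linux/cacheinfo.h>
	#include <linux/cpumask.h>

	/* Sharing mask of @cpu's last (highest-level) cache leaf */
	static const struct cpumask *llc_shared_mask(int cpu)
	{
		struct cpu_cacheinfo *cci = get_cpu_cacheinfo(cpu);

		if (!cci || !cci->num_leaves)
			return NULL;

		/* info_list leaves are ordered by ascending cache level */
		return &cci->info_list[cci->num_leaves - 1].shared_cpu_map;
	}

Userspace gets the same information from
/sys/devices/system/cpu/cpuX/cache/indexY/shared_cpu_list once that
driver has initialised.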

My patches address only the LLC and are hence much simpler.

> But let us switch the discussion over to your v2 on this one. I have to
> see your implementation first.
>

Sure, I replied here just to address any open questions. Let us discuss
any further issues in the mail thread below.

> > [1] https://lore.kernel.org/lkml/20220518093325.2070336-1-sudeep.holla@arm.com
> 
> [...]

-- 
Regards,
Sudeep
