lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 28 Feb 2019 10:59:19 -0500
From:   Len Brown <lenb@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     X86 ML <x86@...nel.org>, linux-kernel@...r.kernel.org,
        Len Brown <len.brown@...el.com>, linux-doc@...r.kernel.org
Subject: Re: [PATCH 03/11] x86 topology: Add CPUID.1F multi-die/package support

On Tue, Feb 26, 2019 at 8:54 AM Peter Zijlstra <peterz@...radead.org> wrote:

> > > It would've been nice to have the CPUID instruction 1F leaf reference
> > > 3B-3.9 in the SDM, and maybe mention this here too.
> >
> > I didn't mention SDM sections because they change -- leaving stale
> > pointers in our commit messages.  The SDM is re-published 4 times per
> > year.
>
> Yah, I know. Which is why I keep all SDMs. So if you say, book 3 section
> 8 of Jul'17, I can find it :-)

The SDM is like software -- usually (but not always) you are better
off with the latest version:-)

> > Cache enumeration in Leaf-4 is totally unchanged.
> > ACPI NUMA tables are totally unchanged.
>
> Sure; and yet Sub-NUMA-Clustering broke stuff in interesting ways. I'm
> trying to get a feel for how these levels will interact with all that.
>
> Before that SNC stuff, caches had never spanned NODEs (and I still
> think that is 'creative' at best).

Yeah, SNC is sort of a curve ball.  I guess it made enough stuff run better that
it is available as an option.  But it doesn't help everything, so it is disabled
by default...

I think from a scheduler point of view, sticking with the output of
CPUID.4 for the cache topology, and the ACPI tables for the
node topology/distances, is the right strategy.

> > From a scheduler point of view, imagine that a SKX system with 4 die
> > in 4 packages was mechanically re-designed so that those 4 die resided
> > in 2 double-sized packages.
> >
> > They may have tweaked the links between the die, but logically it is
> > identical and compatible, and the legacy kernel will function
> > properly.
>
> This example has LLC in die and yes that works.
>
> But I can imagine things like L2 in tile and L3 across tiles but within
> DIE and then it _might_ make sense to still consider the tile for
> scheduling.
>
> Another option is having the LLC off die; also not unheard of.
>
> And then there's many creative and slightly crazy ways this can all be
> combined :/

If any of those crazy things happen,  CPUID.B/CPUID.1F are not
going to help software understand it -- CPUID.4 and the NUMA tables
are the tool of choice.

> > So the effect of Leaf B,1F is that it defines the scope of MSRs.  eg.
> > what processors does a die-scope MSR cover.  That is why the rest of
> > the patch is about sysfs topology, and about package MSR scope.
> >
> > Yes, there will be more exotic MSR situations in future products --
> > the first ones are pretty simple -- something  called a
> > package-scope-MSR  in the SDM today becomes a die-scope-MSR in this
> > generation on a multi-die/package system.
>
> Yes :-(
>
> > It also reflects how many packages appear in sysfs, and this can
> > effect licensing of some kinds of software.
>
> That's just plain insanity and we should not let that affect our sysfs
> interfaces.

This change isn't made for compatibility with per-package licensing.
Indeed, vendors, who license  based on package-count need to
be made aware that on a system with multi-die/package, they'll
see their package count go _down_ as a result of this change.
Thankfully, I'm told that per-package licensing is quite rare --
most stuff that cares has moved to per-CPU.

I think a good semantic side effect of this series is that it
maintains the invariant that a physical package and a socket are synonymous.
While we don't use the word "socket" in Linux anymore, the industry
broadly assume that the two are synonyms.  And people expect that a
physical package really is a physical package -- you can see it,
buy it in a box, and hold it in your hand.

Functionally, the bottom line is that it allows software to discover topology
levels that previously needed to be discovered by looking up family/model,
in the past, which was sort of annoying.  The things that care are
things that care about MSR scope.   Thankfully, the list of things that care
about MSR scope is quite finite.

thanks,
-Len




--
Len Brown, Intel Open Source Technology Center

Powered by blists - more mailing lists