linux-kernel - Re: [RFC][PATCH] x86, sched: allow topolgies where NUMA nodes share an LLC

Date:   Tue, 7 Nov 2017 08:22:19 -0800
From:   Dave Hansen <dave.hansen@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, tony.luck@...el.com,
        tim.c.chen@...ux.intel.com, hpa@...ux.intel.com, bp@...en8.de,
        rientjes@...gle.com, imammedo@...hat.com, prarit@...hat.com,
        toshi.kani@...com, brice.goglin@...il.com, mingo@...nel.org
Subject: Re: [RFC][PATCH] x86, sched: allow topolgies where NUMA nodes share
 an LLC

On 11/07/2017 12:30 AM, Peter Zijlstra wrote:
> On Mon, Nov 06, 2017 at 02:15:00PM -0800, Dave Hansen wrote:
> 
>> But, the CPUID for the SNC configuration discussed above enumerates
>> the LLC as being shared by the entire package.  This is not 100%
>> precise because the entire cache is not usable by all accesses.  But,
>> it *is* the way the hardware enumerates itself, and this is not likely
>> to change.
> 
> So CPUID and SRAT will remain inconsistent; even in future products?
> That would absolutely blow chunks.

It certainly isn't ideal as it stands.  If it were changed, what would it
be changed to?  You cannot even represent the current L3 topology in
CPUID, at least not precisely.
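
For context, here is a minimal sketch of how cache sharing is usually read
out, assuming the Intel CPUID leaf 4 ("deterministic cache parameters")
layout.  The struct and function names are hypothetical, and the code
decodes an already-fetched EAX value instead of executing CPUID itself.
As far as I can tell, the only sharing information is one count per
cache, which leaves no obvious way to say that part of an L3 belongs to
one NUMA node and part to another.

```c
#include <stdint.h>

/* Hypothetical container for the EAX fields of CPUID leaf 4. */
struct cache_leaf {
	unsigned int type;        /* 0 = no more caches, 1 = data, 2 = instruction, 3 = unified */
	unsigned int level;       /* cache level: 1, 2, 3, ... */
	unsigned int max_sharing; /* max number of logical CPU IDs sharing this cache */
};

/* Decode an EAX value returned by CPUID leaf 4 for one subleaf. */
static inline struct cache_leaf decode_cache_leaf_eax(uint32_t eax)
{
	struct cache_leaf leaf;

	leaf.type = eax & 0x1f;                       /* bits 4:0 */
	leaf.level = (eax >> 5) & 0x7;                /* bits 7:5 */
	leaf.max_sharing = ((eax >> 14) & 0xfff) + 1; /* bits 25:14 hold the count minus one */
	return leaf;
}
```

The decoded result is a single number per cache level, so a package-wide
L3 looks the same whether or not SNC splits the package into NUMA nodes.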

I've been arguing we should optimize the CPUID information for
performance.  Right now, it's suboptimal for folks doing NUMA-local
allocations, and I think that's precisely the group of folks that needs
precise information.  I'm trying to get it changed going forward.

> If that is the case, we'd best use a fake feature like
> X86_BUG_TOPOLOGY_BROKEN and use that instead of an ever growing list of
> models in this code.

FWIW, I don't consider the current situation broken.  Nobody ever
promised the kernel that a NUMA node would never happen inside a socket,
or inside a cache boundary enumerated in CPUID.

The assumptions the kernel made were sane, and so are the CPU's
description of itself *and* the BIOS-provided information.  But, the
world changed, some of those assumptions turned out to be wrong, and
somebody needs to adjust.

...
>> +	if (!topology_same_node(c, o) &&
>> +	    (c->x86_model == INTEL_FAM6_SKYLAKE_X)) {
> 
> This needs a c->x86_vendor test; imagine the fun when AMD releases a
> part with model == SKX ...

Yup, will do.
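
Roughly, the condition with the vendor test added might look like the
sketch below.  This is a userspace illustration only, not the revised
patch: struct cpu_sketch and snc_llc_exception() are hypothetical, the
constants are local stand-ins for the kernel definitions of the same
name, and the node comparison stands in for !topology_same_node(c, o).

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's vendor and model constants. */
enum { X86_VENDOR_INTEL = 0, X86_VENDOR_AMD = 2 };
#define INTEL_FAM6_SKYLAKE_X 0x55

/* Hypothetical, trimmed-down view of the per-CPU data the check needs. */
struct cpu_sketch {
	int x86_vendor;
	int x86_model;
	int node;	/* NUMA node this CPU belongs to */
};

/*
 * True when two CPUs sit on different NUMA nodes and the part is an
 * Intel Skylake-X, i.e. the case where NUMA nodes may share an LLC.
 * Testing the vendor first keeps another vendor's part that happens
 * to reuse the same model number from matching.
 */
static inline bool snc_llc_exception(const struct cpu_sketch *c,
				     const struct cpu_sketch *o)
{
	return c->node != o->node &&
	       c->x86_vendor == X86_VENDOR_INTEL &&
	       c->x86_model == INTEL_FAM6_SKYLAKE_X;
}
```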

>> +		/* Use NUMA instead of coregroups for scheduling: */
>> +		x86_has_numa_in_package = true;
>> +
>> +		/*
>> +		 * Now, tell the truth, that the LLC matches. But,
>> +		 * note that throwing away coregroups for
>> +		 * scheduling means this will have no actual effect.
>> +		 */
>> +		return true;
> 
> What are the ramifications here? Is anybody else using that cpumask
> outside of the scheduler topology setup?

I looked for it and didn't see anything else.  I'll double check that
nothing has popped up since I hacked this together.