Date:	Tue, 16 Sep 2014 17:59:28 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Chuck Ebbert <cebbert.lkml@...il.com>, Dave Hansen <dave@...1.net>,
	linux-kernel@...r.kernel.org, borislav.petkov@....com,
	andreas.herrmann3@....com, hpa@...ux.intel.com, ak@...ux.intel.com
Subject: Re: [PATCH] x86: Consider multiple nodes in a single socket to be
 "sane"

On Tue, Sep 16, 2014 at 08:44:03AM +0200, Ingo Molnar wrote:
> 
> * Chuck Ebbert <cebbert.lkml@...il.com> wrote:
> 
> > On Tue, 16 Sep 2014 05:29:20 +0200
> > Peter Zijlstra <peterz@...radead.org> wrote:
> > 
> > > On Mon, Sep 15, 2014 at 03:26:41PM -0700, Dave Hansen wrote:
> > > > 
> > > > I'm getting the spew below when booting with Haswell (Xeon 
> > > > E5-2699) CPUs and the "Cluster-on-Die" (CoD) feature 
> > > > enabled in the BIOS.
> > > 
> > > What is that cluster-on-die thing? I've heard it before but 
> > > never could find anything on it.
> > 
> > Each CPU has 2.5MB of L3 connected together in a ring that 
> > makes it all act like a single shared cache. The HW tries to 
> > place the data so it's closest to the CPU that uses it. On the 
> > larger processors there are two rings with an interconnect 
> > between them that adds latency if a cache fetch has to cross 
> > that. CoD breaks that connection and effectively gives you two 
> > nodes on one die.
> 
> Note that that's not really a 'NUMA node' in the way lots of
> places in the kernel assume it: permanent placement asymmetry
> (and access cost asymmetry) of RAM.

Agreed, that is not NUMA: both groups will have the exact same local
DRAM latency (unlike the AMD thing, which has two memory buses on the
single package and therefore really does have two nodes on a single
chip).
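
If anyone wants to double-check that, the claim is easy to measure
from userspace: run on each node in turn, allocate node-local memory
and do a dependent pointer chase through a buffer much larger than
the L3. Rough sketch below; the buffer and iteration sizes and the
use of libnuma are just my choices for the sketch. Build with
gcc -O2 chase.c -lnuma (add -lrt on older glibc).

/*
 * Rough sketch, not part of the patch: for each node, run on that
 * node, allocate node-local memory and time a dependent pointer
 * chase through a 256MB buffer (way bigger than any L3), so every
 * load should miss to local DRAM.  On a CoD box both halves of a
 * package should report about the same number.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <numa.h>

#define BUF_ENTRIES     (32UL << 20)    /* 32M pointers = 256MB */
#define CHASE_STEPS     (16UL << 20)

static double chase_node(int node)
{
        size_t *buf, i, j, tmp, sz = BUF_ENTRIES * sizeof(size_t);
        struct timespec t0, t1;
        volatile size_t cur;

        numa_run_on_node(node);                 /* run locally ...     */
        buf = numa_alloc_onnode(sz, node);      /* ... on local memory */
        if (!buf)
                return -1.0;

        /*
         * Build one big random cycle (Sattolo) so every load depends
         * on the previous one and the prefetchers can't help.
         */
        for (i = 0; i < BUF_ENTRIES; i++)
                buf[i] = i;
        for (i = BUF_ENTRIES - 1; i > 0; i--) {
                j = rand() % i;
                tmp = buf[i]; buf[i] = buf[j]; buf[j] = tmp;
        }

        clock_gettime(CLOCK_MONOTONIC, &t0);
        cur = 0;
        for (i = 0; i < CHASE_STEPS; i++)
                cur = buf[cur];
        clock_gettime(CLOCK_MONOTONIC, &t1);

        numa_free(buf, sz);
        return ((t1.tv_sec - t0.tv_sec) * 1e9 +
                (t1.tv_nsec - t0.tv_nsec)) / CHASE_STEPS;
}

int main(void)
{
        int node;

        if (numa_available() < 0) {
                fprintf(stderr, "no NUMA support\n");
                return 1;
        }
        for (node = 0; node <= numa_max_node(); node++)
                printf("node %d: %.1f ns per dependent load\n",
                       node, chase_node(node));
        return 0;
}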

This also means that CoD sets up the NUMA masks incorrectly, since
there is no memory asymmetry behind them.
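
It is easy enough to see from userspace what masks we ended up with;
something like the throwaway sketch below dumps the package id, node
and L3 sharing list per CPU (standard sysfs paths assumed, with
cache/index3 taken to be the L3).

/*
 * Throwaway sketch: dump, for each online CPU, its package id, NUMA
 * node and the set of CPUs sharing its L3.  With CoD on, one package
 * should show up as two nodes, each matching one L3 "cluster".
 * Stops at the first gap in the cpu numbering, for brevity.
 */
#include <stdio.h>
#include <string.h>
#include <dirent.h>

static int read_str(const char *path, char *buf, size_t len)
{
        FILE *f = fopen(path, "r");

        if (!f)
                return -1;
        if (!fgets(buf, len, f)) {
                fclose(f);
                return -1;
        }
        buf[strcspn(buf, "\n")] = '\0';
        fclose(f);
        return 0;
}

/* the cpuN directory contains a "nodeM" link when NUMA is enabled */
static int cpu_node(int cpu)
{
        char path[128];
        struct dirent *d;
        DIR *dir;
        int node = -1;

        snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%d", cpu);
        dir = opendir(path);
        if (!dir)
                return -1;
        while ((d = readdir(dir)))
                if (sscanf(d->d_name, "node%d", &node) == 1)
                        break;
        closedir(dir);
        return node;
}

int main(void)
{
        char path[160], pkg[64], l3[256];
        int cpu;

        for (cpu = 0; ; cpu++) {
                snprintf(path, sizeof(path),
                         "/sys/devices/system/cpu/cpu%d/topology/physical_package_id",
                         cpu);
                if (read_str(path, pkg, sizeof(pkg)))
                        break;

                snprintf(path, sizeof(path),
                         "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list",
                         cpu);
                if (read_str(path, l3, sizeof(l3)))
                        strcpy(l3, "?");

                printf("cpu%-3d pkg %s node %d l3 %s\n",
                       cpu, pkg, cpu_node(cpu), l3);
        }
        return 0;
}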

> It's a new topology construct that needs new handling (and 
> probably a new mask): Non Uniform Cache Architecture (NUCA)
> or so.

It's not new; many chips do this, most notably the Core2Quad, which
was basically two Core2Duo dies in a single package. So you have 2
cores sharing a cache, and 2x2 cores sharing the memory bus.

The Silvermont 'modules' do basically the same thing: multiple groups
of cores that each share a cache sit on the same memory bus.

This is not new, and we can represent it just fine; it just looks like
the Intel CoD stuff sets up the NUMA masks wrong.
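
For reference, and quoting from memory rather than the tree, the
default topology table already separates the LLC-sharing level from
the package level: on x86 the MC mask is cpu_coregroup_mask(), i.e.
the LLC sharing mask, DIE covers the package, and the NUMA levels on
top are built from the node masks, which is exactly where the
CoD-programmed masks go wrong. An arch can also install its own table
via set_sched_topology() if it ever needs an extra level.

/*
 * Sketch of the scheduler's default topology table (from memory of
 * kernel/sched/core.c around this time; trimmed, not a verbatim
 * quote).  The MC level's mask on x86 is cpu_coregroup_mask(), i.e.
 * "CPUs sharing an LLC"; NUMA levels are added above DIE from the
 * node masks.
 */
static struct sched_domain_topology_level default_topology[] = {
#ifdef CONFIG_SCHED_SMT
        { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
#endif
#ifdef CONFIG_SCHED_MC
        { cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
#endif
        { cpu_cpu_mask, SD_INIT_NAME(DIE) },
        { NULL, },
};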