Message-ID: <20170627174411.gheip4jmra2ihuhq@pd.tnic>
Date:   Tue, 27 Jun 2017 19:44:11 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, leo.duran@....com,
        yazen.ghannam@....com, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 1/2] x86/CPU/AMD: Present package as die instead of socket

On Tue, Jun 27, 2017 at 11:54:12PM +0700, Suravee Suthikulpanit wrote:
> The 8 threads sharing each L3 are already in the same sched-domain1 (MC
> CCX). So, cpu0 is in the same sched-domain1 as cpu1,2,3,64,65,66,67. Here,
> we need the DIE sched-domain because it represents all cpus that are in the
> same NUMA node (since we have one memory controller per DIE).
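
For concreteness: the L3 sharing described in the quote is visible from
userspace. A minimal sketch, assuming the standard sysfs cacheinfo layout
and that index3 is the L3 on this machine; on the box quoted above it
would be expected to print 0-3,64-67 for cpu0:

/*
 * Minimal sketch: print which CPUs share cpu0's L3. Assumes the
 * standard sysfs cacheinfo layout and that index3 is the L3 here.
 */
#include <stdio.h>

int main(void)
{
        char buf[256];
        FILE *f;

        f = fopen("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list", "r");
        if (!f) {
                perror("fopen");
                return 1;
        }
        if (fgets(buf, sizeof(buf), f))
                printf("CPUs sharing cpu0's L3: %s", buf);
        fclose(f);
        return 0;
}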

So this is still confusing. Please drop the term "DIE sched-domain": it
is something you're trying to define, and I'm still trying to parse what
exactly you're defining and why.

> IIUC, for Zen, w/o the DIE sched-domain, the scheduler could try to
> re-balance the tasks from one CCX (schedule group) to another CCX
> across NUMA node, and

CCX, schedule group, NUMA node, ... now my head is spinning. Do you
see what I mean about agreeing on the nomenclature and proper term
definitions first?

> potentially causing an unnecessary performance hit due to remote memory access.
> 
> Please note also that the SRAT/SLIT information is used to derive the NUMA
> sched-domains, while the DIE sched-domain is a non-NUMA sched-domain (derived
> from the CPUID topology extension, which is available on newer families).
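
FWIW, the CPUID topology extension being referred to here is leaf
0x8000001E; a rough userspace sketch, with the field layout as in the
AMD manuals for family 17h and availability gated on the
TopologyExtensions bit (Fn8000_0001 ECX bit 22):

/*
 * Rough sketch of reading the CPUID topology extension: leaf
 * 0x8000001E, field layout per the AMD manuals for family 17h.
 */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx) ||
            !(ecx & (1u << 22))) {
                fprintf(stderr, "no TopologyExtensions\n");
                return 1;
        }

        if (!__get_cpuid(0x8000001e, &eax, &ebx, &ecx, &edx))
                return 1;

        printf("ext APIC ID %u, core ID %u, node ID %u, nodes/processor %u\n",
               eax, ebx & 0xff, ecx & 0xff, ((ecx >> 8) & 0x7) + 1);
        return 0;
}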

So let's try to discuss this without using DIE sched-domain, CCX, etc.,
and let's start simple.

So in that die graphic:

              ----------------------------
          C0  | T0 T1 |    ||    | T0 T1 | C4
              --------|    ||    |--------
          C1  | T0 T1 | L3 || L3 | T0 T1 | C5
              --------|    ||    |--------
          C2  | T0 T1 | #0 || #1 | T0 T1 | C6
              --------|    ||    |--------
          C3  | T0 T1 |    ||    | T0 T1 | C7
              ----------------------------

you want all those threads to belong to a single scheduling group.
Correct?
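
As a sanity check, the domain hierarchy the scheduler actually built for
a CPU can be dumped; a rough sketch, assuming a CONFIG_SCHED_DEBUG
kernel which exposes it under /proc/sys/kernel/sched_domain/:

/*
 * Rough sketch: list the sched-domain hierarchy the kernel built for
 * cpu0, e.g. SMT/MC/DIE/NUMA, one name file per domain level.
 */
#include <stdio.h>
#include <glob.h>

int main(void)
{
        glob_t g;
        size_t i;

        if (glob("/proc/sys/kernel/sched_domain/cpu0/domain*/name", 0, NULL, &g))
                return 1;

        for (i = 0; i < g.gl_pathc; i++) {
                char buf[64];
                FILE *f = fopen(g.gl_pathv[i], "r");

                if (f && fgets(buf, sizeof(buf), f))
                        printf("%s: %s", g.gl_pathv[i], buf);
                if (f)
                        fclose(f);
        }
        globfree(&g);
        return 0;
}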

Now that thing has a memory controller attached to it, correct?

If so, why is this thing not a logical NUMA node, as described in
SRAT/SLIT?

If not, what does a NUMA node entail on Zen as described by SRAT/SLIT?
I.e., what is the difference between the two things? I.e., how many dies
as above are in a NUMA node?

Now, SRAT should contain the assignment of which core belongs to which
node. Why is that not sufficient?
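
(That SRAT-derived core-to-node assignment is what libnuma reports back
to userspace; a minimal sketch, assuming libnuma is installed, built
with -lnuma:)

/*
 * Minimal sketch: print the node each CPU was assigned to, i.e. the
 * SRAT-derived core-to-node mapping.
 */
#include <stdio.h>
#include <numa.h>

int main(void)
{
        int cpu, ncpus;

        if (numa_available() < 0) {
                fprintf(stderr, "no NUMA support\n");
                return 1;
        }

        ncpus = numa_num_configured_cpus();
        for (cpu = 0; cpu < ncpus; cpu++)
                printf("cpu%d -> node %d\n", cpu, numa_node_of_cpu(cpu));

        return 0;
}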

Ok, that should be enough questions for now. Let's start with them.

Thanks.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

