lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 14 Jun 2023 22:58:20 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     K Prateek Nayak <kprateek.nayak@....com>,
        <linux-kernel@...r.kernel.org>,
        <linux-tip-commits@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
        <x86@...nel.org>, Gautham Shenoy <gautham.shenoy@....com>,
        Tim Chen <tim.c.chen@...el.com>
Subject: Re: [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling()

On 2023-06-14 at 10:17:57 +0200, Peter Zijlstra wrote:
> On Tue, Jun 13, 2023 at 04:00:39PM +0530, K Prateek Nayak wrote:
> 
> > >> - SIS_NODE_TOPOEXT - tip:sched/core + this patch
> > >>                      + new sched domain (Multi-Multi-Core or MMC)
> > >> 		     (https://lore.kernel.org/all/20230601153522.GB559993@hirez.programming.kicks-ass.net/)
> > >> 		     MMC domain groups 2 nearby CCX.
> > > 
> > > OK, so you managed to get the NPS4 topology in NPS1 mode?
> > 
> > Yup! But it is a hack. I'll leave the patch at the end.
> 
> Chen Yu, could we do the reverse? Instead of building a bigger LLC
> domain, can we split our LLC based on SNC (sub-numa-cluster) topologies?
>
Hi Peter,
Do you mean with SNC enabled, if the LLC domain gets smaller? 
According to the test, the answer seems to be yes.

SNC enabled:
 grep . /sys/kernel/debug/sched/domains/cpu0/domain*/{name,flags}
/sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu0/domain1/name:MC
/sys/kernel/debug/sched/domains/cpu0/domain2/name:NUMA
/sys/kernel/debug/sched/domains/cpu0/domain3/name:NUMA
/sys/kernel/debug/sched/domains/cpu0/domain0/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_CPUCAPACITY SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
/sys/kernel/debug/sched/domains/cpu0/domain1/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
/sys/kernel/debug/sched/domains/cpu0/domain2/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SERIALIZE SD_OVERLAP SD_NUMA
/sys/kernel/debug/sched/domains/cpu0/domain3/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SERIALIZE SD_OVERLAP SD_NUMA

The MC domain1 has the SD_SHARE_PKG_RESOURCES flag, while
the sub-NUMA domain2 does not have it.

cat /proc/schedstat | grep cpu0 -A 4
cpu0 0 0 0 0 0 0 737153151491 189570367069 38103461
domain0 00000000,00000000,00000000,00010000,00000000,00000000,00000001 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,00000000,00000fff,ffff0000,00000000,00000000,0fffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 00000000,000000ff,ffffffff,ffff0000,00000000,00ffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain3 ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

SNC disabled:
grep . /sys/kernel/debug/sched/domains/cpu0/domain*/{name,flags}
/sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
/sys/kernel/debug/sched/domains/cpu0/domain1/name:MC
/sys/kernel/debug/sched/domains/cpu0/domain2/name:NUMA
/sys/kernel/debug/sched/domains/cpu0/domain0/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_CPUCAPACITY SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
/sys/kernel/debug/sched/domains/cpu0/domain1/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SHARE_PKG_RESOURCES SD_PREFER_SIBLING
/sys/kernel/debug/sched/domains/cpu0/domain2/flags:SD_BALANCE_NEWIDLE SD_BALANCE_EXEC SD_BALANCE_FORK SD_WAKE_AFFINE SD_SERIALIZE SD_OVERLAP SD_NUMA


cat /proc/schedstat | grep cpu0 -A 4
cpu0 0 0 0 0 0 0 1030156602546 31734021553 1590389
domain0 00000000,00000000,00000000,00010000,00000000,00000000,00000001 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain1 00000000,000000ff,ffffffff,ffff0000,00000000,00ffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
domain2 ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cpu1 0 0 0 0 0 0 747972895325 29319670483 1420637

> Because as you know, Intel chips are having the reverse problem of the
> LLC being entirely too large, so perhaps we can break it up along the
> SNC lines.
> 
> Could you see if that works?
I think spliting LLC would help. Or do you mean even with SNC disabled, are
we able to detect the SNC border? Currently the numa topo is detected from ACPI
table, not sure if SNC is disabled in BIOS, can the OS detect that. I'll have
a check.

thanks,
Chenyu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ