Message-Id: <cover.1759515405.git.tim.c.chen@linux.intel.com>
Date: Fri,  3 Oct 2025 12:31:26 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>
Cc: Tim Chen <tim.c.chen@...ux.intel.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Tim Chen <tim.c.chen@...el.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Len Brown <len.brown@...el.com>,
	linux-kernel@...r.kernel.org,
	Chen Yu <yu.c.chen@...el.com>,
	K Prateek Nayak <kprateek.nayak@....com>,
	"Gautham R . Shenoy" <gautham.shenoy@....com>,
	Zhao Liu <zhao1.liu@...el.com>,
	Vinicius Costa Gomes <vinicius.gomes@...el.com>,
	Arjan Van De Ven <arjan.van.de.ven@...el.com>
Subject: [PATCH v5 0/2] Fix NUMA sched domain build errors for GNR and CWF

While testing Granite Rapids (GNR) and Clearwater Forest (CWF) systems
in SNC-3 mode, we encountered sched domain build errors in dmesg.
The scheduler domain code did not expect asymmetric node distances
from a local node to multiple nodes in a remote package. As a result,
remote nodes were partially grouped with local nodes in asymmetric
groupings, which created too many levels in the NUMA sched domain
hierarchy.
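
For illustration only (not part of the patches): the generic topology
code builds, as far as we understand it, roughly one NUMA level per
distinct distance value, so a table with many different remote distances
yields many levels. The sketch below counts distinct values in a small
distance table. The table contents, NR_NODES, example_slit and
count_unique_distances() are all hypothetical and do not come from a real
GNR or CWF system.

```c
#include <stdbool.h>

#define NR_NODES 6	/* hypothetical: 2 packages x 3 SNC nodes */

/* Hypothetical SLIT-like table; the values are illustrative only. */
static const int example_slit[NR_NODES][NR_NODES] = {
	{ 10, 15, 17, 21, 23, 26 },
	{ 15, 10, 15, 23, 21, 23 },
	{ 17, 15, 10, 26, 23, 21 },
	{ 21, 23, 26, 10, 15, 17 },
	{ 23, 21, 23, 15, 10, 15 },
	{ 26, 23, 21, 17, 15, 10 },
};

/* Count how many distinct distance values appear in the table. */
static int count_unique_distances(const int dist[NR_NODES][NR_NODES])
{
	int seen[NR_NODES * NR_NODES];
	int nr = 0;

	for (int i = 0; i < NR_NODES; i++) {
		for (int j = 0; j < NR_NODES; j++) {
			bool found = false;

			for (int k = 0; k < nr; k++) {
				if (seen[k] == dist[i][j])
					found = true;
			}
			if (!found)
				seen[nr++] = dist[i][j];
		}
	}
	return nr;
}
```

With this made-up table the remote package alone contributes three
different distances (21, 23 and 26) on top of the local ones.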

To address this, we simplify remote node distances for the purpose of
sched domain construction on GNR and CWF. Specifically, we replace the
individual distances to nodes within the same remote package with their
average distance. This resolves the domain build errors and reduces the
number of NUMA sched domain levels.
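
A minimal user-space sketch of the averaging idea follows. It is not the
code from the patches: the table values, node_to_pkg(), sched_distance()
and the fixed three-nodes-per-package layout are assumptions made for
illustration, and the choice to average over every node pair between the
two packages is also an assumption.

```c
#define NR_NODES	6	/* hypothetical: 2 packages x 3 SNC nodes */
#define NODES_PER_PKG	3

/* Hypothetical SLIT-like table; the values are illustrative only. */
static const int slit[NR_NODES][NR_NODES] = {
	{ 10, 15, 17, 21, 23, 26 },
	{ 15, 10, 15, 23, 21, 23 },
	{ 17, 15, 10, 26, 23, 21 },
	{ 21, 23, 26, 10, 15, 17 },
	{ 23, 21, 23, 15, 10, 15 },
	{ 26, 23, 21, 17, 15, 10 },
};

/* Assumed mapping: nodes 0-2 are in package 0, nodes 3-5 in package 1. */
static int node_to_pkg(int node)
{
	return node / NODES_PER_PKG;
}

/*
 * Distance used for sched domain construction in this sketch: nodes in
 * the same package keep their SLIT distance, nodes in different packages
 * all get the average distance between the two packages.
 */
static int sched_distance(int from, int to)
{
	int pkg_from = node_to_pkg(from);
	int pkg_to = node_to_pkg(to);
	int sum = 0, count = 0;

	if (pkg_from == pkg_to)
		return slit[from][to];

	for (int i = 0; i < NR_NODES; i++) {
		for (int j = 0; j < NR_NODES; j++) {
			if (node_to_pkg(i) == pkg_from &&
			    node_to_pkg(j) == pkg_to) {
				sum += slit[i][j];
				count++;
			}
		}
	}
	return sum / count;	/* integer average, rounds down */
}
```

With this table every cross-package pair collapses to a single value,
while distances inside a package and the slit[] table itself are left
untouched.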

The actual SLIT NUMA node distances are still preserved separately, in
case they are needed when building sched domains. NUMA balancing
continues to use the true distances when selecting a closer remote node
for a task’s numa_group.
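
As a rough illustration of why the true distances are kept: once remote
distances are flattened, the simplified table can no longer tell which
remote node is closest. The sketch below uses two hypothetical tables
(true_dist and sched_dist) and a hypothetical helper,
closest_remote_node(); none of these names or values come from the
patches.

```c
#define NR_NODES	6	/* hypothetical: 2 packages x 3 SNC nodes */
#define NODES_PER_PKG	3

/* Hypothetical true (SLIT-like) distances; values are illustrative. */
static const int true_dist[NR_NODES][NR_NODES] = {
	{ 10, 15, 17, 21, 23, 26 },
	{ 15, 10, 15, 23, 21, 23 },
	{ 17, 15, 10, 26, 23, 21 },
	{ 21, 23, 26, 10, 15, 17 },
	{ 23, 21, 23, 15, 10, 15 },
	{ 26, 23, 21, 17, 15, 10 },
};

/* Same table with every cross-package distance replaced by 23. */
static const int sched_dist[NR_NODES][NR_NODES] = {
	{ 10, 15, 17, 23, 23, 23 },
	{ 15, 10, 15, 23, 23, 23 },
	{ 17, 15, 10, 23, 23, 23 },
	{ 23, 23, 23, 10, 15, 17 },
	{ 23, 23, 23, 15, 10, 15 },
	{ 23, 23, 23, 17, 15, 10 },
};

/*
 * Return the node in the other package with the smallest distance from
 * 'from' according to 'dist'. Ties go to the lowest node number.
 */
static int closest_remote_node(const int dist[NR_NODES][NR_NODES], int from)
{
	int best = -1;

	for (int n = 0; n < NR_NODES; n++) {
		if (n / NODES_PER_PKG == from / NODES_PER_PKG)
			continue;	/* skip nodes in the local package */
		if (best < 0 || dist[from][n] < dist[from][best])
			best = n;
	}
	return best;
}
```

For node 2, the true table picks node 5 (distance 21), whereas the
flattened table sees a three-way tie and falls back to node 3.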

This version also revises the detection of arch_sched_node_distance()
substitution with a cleaner implementation, thanks to Chen Yu’s
suggestion.  Much appreciation as well to Prateek for reviewing earlier
versions.  Prateek, if this revision looks good to you, please consider
adding your Reviewed-by.

Thanks,
Tim

Changes in v5:
- Revise detection of arch_sched_node_distance() replacement with a
  cleaner implementation.
- Link to v4: https://lore.kernel.org/lkml/cover.1758234869.git.tim.c.chen@linux.intel.com/ 

Changes in v4:
- Move average node distance computation to x86 specific code
- Put all the changes under CONFIG_NUMA.
- Use __free() to simplify code.
- Allocate separate distance array only if node distances are
  modified.
- Assert that we don't have more than 2 packages for GNR/CWF
  when replacing remote node distances with average remote node
  distance.
- Comments and code style clean ups.
- Link to v3:
  https://lore.kernel.org/lkml/cover.1757614784.git.tim.c.chen@linux.intel.com/

Changes in v3:
- Simplify sched_record_numa_dist() by getting rid of max distance
  computation. 
- Minor clean ups.
- Link to v2:
  https://lore.kernel.org/lkml/61a6adbb845c148361101e16737307c8aa7ee362.1757097030.git.tim.c.chen@linux.intel.com/

Changes in v2:
- Allow an architecture to modify the NUMA distances used for
  building sched domains, in order to simplify the NUMA domains.
  Maintain these sched domain NUMA distances separately from the
  actual NUMA distances.
- Use average remote node distance as the distance to nodes in remote
  packages for GNR and CWF.
- Remove the original fix for topology_span_sane() that's superseded
  by a better fix from Prateek.
  https://lore.kernel.org/lkml/175688671425.1920.13690753997160836570.tip-bot2@tip-bot2/.
- Link to v1: https://lore.kernel.org/lkml/cover.1755893468.git.tim.c.chen@linux.intel.com/


Tim Chen (2):
  sched: Create architecture specific sched domain distances
  sched/topology: Fix sched domain build error for GNR, CWF in SNC-3
    mode

 arch/x86/kernel/smpboot.c      |  70 +++++++++++++++++++++
 include/linux/sched/topology.h |   1 +
 kernel/sched/topology.c        | 108 ++++++++++++++++++++++++++-------
 3 files changed, 157 insertions(+), 22 deletions(-)

-- 
2.32.0

