Date: Wed, 27 Mar 2024 13:03:42 -0700
From: Tony Luck <tony.luck@...el.com>
To: Fenghua Yu <fenghua.yu@...el.com>,
	Reinette Chatre <reinette.chatre@...el.com>,
	Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>,
	Peter Newman <peternewman@...gle.com>,
	James Morse <james.morse@....com>,
	Babu Moger <babu.moger@....com>,
	Drew Fustini <dfustini@...libre.com>
Cc: x86@...nel.org,
	linux-kernel@...r.kernel.org,
	patches@...ts.linux.dev,
	Tony Luck <tony.luck@...el.com>
Subject: [PATCH 00/10] Add support for Sub-NUMA cluster (SNC) systems

This series applies on top of v6.9-rc1 plus these two patches:

Link: https://lore.kernel.org/all/20240308213846.77075-1-tony.luck@intel.com/

The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features.  Prior to this
series, Intel has advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows monitoring features to be used, with the caveat
that users must be aware that Linux may migrate tasks more frequently
between SNC nodes than between "regular" NUMA nodes, so reading counters
from all SNC nodes may be needed to get a complete picture of activity
for tasks.

Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.

This is a new approach triggered by the discussions that started with
"How can users tell that SNC is enabled?" but then drifted into
whether users of the legacy interface would really get what they
expected when reading from monitor files in the mon_L3_* directories.

During that discussion I'd mentioned providing monitor values for both
the L3 level, and also for each SNC node. That would provide full ABI
compatibility while also giving the finer grained reporting from each
SNC node.

The implementation sets up a new rdt_resource to track monitor resources
with domains for each SNC node. This resource is only used when SNC
mode is detected.

On SNC systems there is a parent-child relationship between the
old L3 resource and the new SUBL3 resource. Reading from legacy
files like mon_data/mon_L3_00/llc_occupancy reads and sums the RMID
counters from all "child" domains in the SUBL3 resource. E.g. on
an SNC3 system:

$ grep . mon_L3_01/llc_occupancy mon_L3_01/*/llc_occupancy
mon_L3_01/llc_occupancy:413097984
mon_L3_01/mon_SUBL3_03/llc_occupancy:141484032
mon_L3_01/mon_SUBL3_04/llc_occupancy:135659520
mon_L3_01/mon_SUBL3_05/llc_occupancy:135954432

So the legacy L3 file reports the total occupancy, which is
the sum of the cache occupancy on each of the SNC nodes
that share that L3 cache instance.
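
As a sanity check on the numbers above, the summation can be sketched in
a few lines of userspace C. This is only an illustration of the
arithmetic, not the kernel code in this series; the function name
sum_child_occupancy() is made up for the example, and the caller is
assumed to have already read the per-node values from the
mon_SUBL3_*/llc_occupancy files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the series): add up the
 * llc_occupancy values read from each "child" SNC node to get
 * the value expected in the parent mon_L3_* file.
 */
static uint64_t sum_child_occupancy(const uint64_t *children, size_t n)
{
	uint64_t total = 0;
	size_t i;

	/* A NULL array is only acceptable when there are no children. */
	assert(children != NULL || n == 0);

	for (i = 0; i < n; i++)
		total += children[i];

	return total;
}
```

Feeding it the three mon_SUBL3_03..05 values from the grep output gives
141484032 + 135659520 + 135954432 = 413097984, which matches the value
in mon_L3_01/llc_occupancy.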

Patch 0001 has been salvaged from the previous postings.
All the rest are new.

Signed-off-by: Tony Luck <tony.luck@...el.com>

Tony Luck (10):
  x86/resctrl: Prepare for new domain scope
  x86/resctrl: Add new rdt_resource for sub-node monitoring
  x86/resctrl: Add new "enabled" state for monitor resources
  x86/resctrl: Add pointer to enabled monitor resource
  x86/resctrl: Add parent/child information to rdt_resource and
    rdt_domain
  x86/resctrl: Update mkdir_mondata_subdir() for Sub-NUMA domains
  x86/resctrl: Update rmdir_mondata_subdir_allrdtgrp() for Sub-NUMA
    domains
  x86/resctrl: Mark L3 monitor files with summation flag.
  x86/resctrl: Update __mon_event_count() for Sub-NUMA domains
  x86/resctrl: Determine Sub-NUMA configuration

 include/linux/resctrl.h                   |  20 ++-
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  23 ++-
 arch/x86/kernel/cpu/resctrl/core.c        |  76 +++++++---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |   3 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     | 136 +++++++++++++++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |   6 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 170 +++++++++++++++++-----
 8 files changed, 364 insertions(+), 71 deletions(-)

-- 
2.44.0

