Message-ID: <SJ1PR11MB6083BEDD3F7625B52E677647FC66A@SJ1PR11MB6083.namprd11.prod.outlook.com>
Date: Thu, 29 May 2025 18:14:24 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Qinyun Tan <qinyuntan@...ux.alibaba.com>
CC: "H . Peter Anvin" <hpa@...or.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>, "Chatre,
Reinette" <reinette.chatre@...el.com>
Subject: RE: [PATCH V2 1/1] x86/resctrl: Remove unappropriate references to
cacheinfo in the resctrl subsystem.
> To resolve these issues:
>
> 1. Replace direct cacheinfo references in struct rdt_mon_domain and struct
> rmid_read with the cacheinfo ID (a unique identifier for the L3 cache).
>
> 2. The hdr.cpu_mask maintained by resctrl constitutes a subset of
> shared_cpu_map. When reading top-level events, we dynamically select a CPU
> from hdr.cpu_mask and utilize its corresponding shared_cpu_map for resctrl
> to determine valid CPUs for reading RMID counter via the MSR interface.
>
> Fixes: 328ea68874642 ("x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files")
> Signed-off-by: Qinyun Tan <qinyuntan@...ux.alibaba.com>

Took this patch for a test run on a 2-socket Granite Rapids system configured
in SNC 3 mode.

While monitoring total memory bandwidth I took the first CPU on each of NUMA
nodes {1..5} offline. Monitoring kept working. Brought those CPUs back online.
Still OK. Took all CPUs on NUMA node 4 offline. Still good. Brought those
CPUs back online. Still good.
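
For anyone wanting to repeat the offline/online steps above, here is a rough sketch of how they could be scripted through the standard sysfs CPU hotplug files. It is not the script used for this test; the helper names (parse_cpulist, node_cpus, set_online) are made up for illustration, and writing the online files needs root.

```python
from pathlib import Path

# Standard sysfs locations for CPU hotplug and NUMA topology.
SYSFS_CPU = Path("/sys/devices/system/cpu")
SYSFS_NODE = Path("/sys/devices/system/node")


def parse_cpulist(text):
    """Expand a sysfs cpulist string such as "0-3,8" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def node_cpus(node, root=SYSFS_NODE):
    """Return the CPUs that sysfs reports for the given NUMA node (hypothetical helper)."""
    return parse_cpulist((root / f"node{node}" / "cpulist").read_text())


def set_online(cpu, online, root=SYSFS_CPU):
    """Take a CPU offline or bring it back online (hypothetical helper, needs root)."""
    (root / f"cpu{cpu}" / "online").write_text("1" if online else "0")
```

Under those assumptions, the first step would be something like `set_online(node_cpus(n)[0], False)` for n in 1..5, and the second `set_online(c, False)` for every c in `node_cpus(4)`, in each case followed by the same calls with `True` while the bandwidth monitoring keeps running.
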
Tested-by: Tony Luck <tony.luck@...el.com>
-Tony