Message-ID: <658d9869-ef22-48a7-876a-5bbba4f134ff@amd.com>
Date: Thu, 13 Jun 2024 14:17:18 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghua.yu@...el.com>,
 Reinette Chatre <reinette.chatre@...el.com>,
 Maciej Wieczor-Retman <maciej.wieczor-retman@...el.com>,
 Peter Newman <peternewman@...gle.com>, James Morse <james.morse@....com>,
 Drew Fustini <dfustini@...libre.com>, Dave Martin <Dave.Martin@....com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, patches@...ts.linux.dev
Subject: Re: [PATCH v20 00/18] Add support for Sub-NUMA cluster (SNC) systems

Hi Reinette,

I may be a little out of sync here, and sorry for coming back to this so
late in the series.

Looking at the series again, I see that this approach adds a lot of code.
Take this structure, for example:


@@ -187,10 +196,12 @@ struct rdt_resource {
 	bool			alloc_capable;
 	bool			mon_capable;
 	int			num_rmid;
-	enum resctrl_scope	scope;
+	enum resctrl_scope	ctrl_scope;
+	enum resctrl_scope	mon_scope;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
-	struct list_head	domains;
+	struct list_head	ctrl_domains;
+	struct list_head	mon_domains;
 	char			*name;
 	int			data_width;
 	u32			default_ctrl;

There are now two scope fields and two domain list fields.

This is confusing and will be hard to maintain. Also, I am not sure these
fields are useful for anything other than the SNC feature. The approach
adds quite a bit of code for no clear advantage.

Why don't we just split the RDT_RESOURCE_L3 resource into two separate
resources, one for control and one for monitoring?
We already have "control" only resources (MBA, SMBA, L2). Let's create a
new "monitor" only resource. I feel that would be a much cleaner approach.
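
To make the idea concrete, here is a rough standalone sketch (plain C, not
kernel code, untested against the tree). RDT_RESOURCE_L3_MON, the
trimmed-down struct rdt_resource, the scope values, and the two helper
functions are made up for illustration; they are not taken from Tony's RFC
or from any posted patch. The point is only that each resource keeps a
single scope field, and that no resource is both control and monitor
capable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel enum; values are illustrative. */
enum resctrl_scope {
	RESCTRL_L2_CACHE,
	RESCTRL_L3_CACHE,
	RESCTRL_L3_NODE,	/* assumed scope for per-SNC-node monitoring */
};

enum resctrl_res_level {
	RDT_RESOURCE_L3,	/* control only after the split */
	RDT_RESOURCE_L3_MON,	/* hypothetical monitor-only resource */
	RDT_RESOURCE_L2,
	RDT_RESOURCE_MBA,
	RDT_RESOURCE_SMBA,
	RDT_NUM_RESOURCES,
};

/* Trimmed-down stand-in: one scope per resource, domain list omitted. */
struct rdt_resource {
	bool			alloc_capable;
	bool			mon_capable;
	enum resctrl_scope	scope;
	const char		*name;
};

static const struct rdt_resource rdt_resources[RDT_NUM_RESOURCES] = {
	[RDT_RESOURCE_L3]     = { true,  false, RESCTRL_L3_CACHE, "L3" },
	[RDT_RESOURCE_L3_MON] = { false, true,  RESCTRL_L3_NODE,  "L3_MON" },
	[RDT_RESOURCE_L2]     = { true,  false, RESCTRL_L2_CACHE, "L2" },
	[RDT_RESOURCE_MBA]    = { true,  false, RESCTRL_L3_CACHE, "MB" },
	[RDT_RESOURCE_SMBA]   = { true,  false, RESCTRL_L3_CACHE, "SMBA" },
};

/* Return the first monitor-capable resource, or NULL if there is none. */
static const struct rdt_resource *first_mon_resource(void)
{
	for (int i = 0; i < RDT_NUM_RESOURCES; i++) {
		const struct rdt_resource *r = &rdt_resources[i];

		assert(r->name != NULL);	/* every slot must be filled in */
		if (r->mon_capable)
			return r;
	}
	return NULL;
}

/* True if any resource is both control and monitor capable. */
static bool has_mixed_resource(void)
{
	for (int i = 0; i < RDT_NUM_RESOURCES; i++) {
		if (rdt_resources[i].alloc_capable &&
		    rdt_resources[i].mon_capable)
			return true;
	}
	return false;
}
```

With a layout like this, the control paths would only ever walk the "L3"
resource and the monitoring paths only the "L3_MON" one, so neither side
has to pick between two scope fields or two domain lists.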

Tony has already tried that approach and showed that it is much simpler.

v15-RFC :
https://lore.kernel.org/lkml/20240130222034.37181-1-tony.luck@intel.com/

What do you think?

Thanks
Babu


On 6/10/24 13:35, Tony Luck wrote:
> This series is based on top of tip x86/cache commit f385f0246394
> ("x86/resctrl: Replace open coded cacheinfo searches")
> 
> The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
> that share an L3 cache into two or more sets. This plays havoc with the
> Resource Director Technology (RDT) monitoring features.  Prior to this
> patch Intel has advised that SNC and RDT are incompatible.
> 
> Some of these CPUs support an MSR that can partition the RMID counters
> in the same way. This allows monitoring features to be used. Legacy
> monitoring files provide the sum of counters from each SNC node for
> backwards compatibility. Additional files per SNC node provide details
> per node.
> 
> Memory bandwidth allocation features continue to operate at
> the scope of the L3 cache.
> 
> L3 cache occupancy and allocation operate on the portion of
> L3 cache available for each SNC node.
> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> 
> ---
> Changes since v19: https://lore.kernel.org/all/20240528222006.58283-1-tony.luck@intel.com/
> 
> 1-4:	Refactor on top of <linux/cacheinfo.h> change.
> 	Nothing functional.
> 
> 5:	No change
> 
> 6:	Updated commit message with note about RMID Sharing mode.
> 	Renamed __rmid_read() to __rmid_read_phys() and performed
> 	translation from logical RMID to physical RMID at callsites.
> 	Updated comment for __rmid_read_phys() with explanation of
> 	logical/physical RMIDs. Consistently use "SNC node", avoid
> 	"SNC domain". Add specifics for non-SNC mode.
> 	Joined split line on __rmid_read() definition (even with the
> 	added "_phys" in its name it still fits on one line).
> 
> 7:	No change
> 
> 8:	get_cpu_cacheinfo_level() moved to <linux/cacheinfo.h>
> 	currently in tip x86/cache
> 	no other changes
> 
> 9:	Dropped the "sumdomains" field from struct rmid_read (a NULL
> 	domain field now indicates that summing is needed).
> 	Fix kerneldoc comments for struct rmid_read.
> 	Updated commit comments with more "why" than "what".
> 
> 10:	No change
> 
> 11:	Fix commit comments per suggestions
> 	1) Added some "why it is OK to take a bit from evtid"
> 	2) s/The stolen bit is given to/Give the bit to/
> 	3) Don't use "l3_cache_id" (which looks like a variable)
> 
> 12:	Fix commit message.
> 	s/kernfs_find_and_get_ns()/kernfs_find_and_get()/
> 	Add kernfs_put() to drop hold from kernfs_find_and_get()
> 	Drop useless "/* create the directory */" comment.
> 
> 13:	Add kernfs_put() to drop hold from kernfs_find_and_get() [two places]
> 
> 14:	Add cpumask parameter to mon_event_read() so SNC decisions are
> 	all in rdtgroup_mondata_show() instead of spread between functions.
> 	Add comments in rdtgroup_mondata_show() to explain the sum vs. no-sum
> 	cases.
> 	Moved the mon_event_read() call into both arms of the if-else
> 	instead of "d = NULL; goto got_cacheinfo;"
> 
> 15:	New (replaces 15-17). Make __mon_event_read() do the sum across
> 	domains (at filesystem level). Move the CPU/domain sanity check out
> 	of resctrl_arch_rmid_read() and into __mon_event_read()
> 	with separate scope tests for single domain vs. sum over
> 	domains.
> 
> 16:	[Was 18] Update commit message with details about MSR 0xCA0, what changes
> 	when bit 0 is cleared, and why this is necessary.
> 	Dropped "Add an architecture specific hook" language from
> 	commit message.
> 
> 17:	[Was 19] Drop "and enabling" from shortlog (enabling done by
> 	previous commit).
> 	Added checks that cpumask_weight() isn't returning zero (to keep
> 	static checkers from warning of possible divide by zero).
> 
> 18:	[Was 20] Fix some "Sub-NUMA" references to say "Sub-NUMA Cluster"
> 	Added document section on effect of SNC mode on MBA and L3 CAT.
> 
> Tony Luck (18):
>   x86/resctrl: Prepare for new domain scope
>   x86/resctrl: Prepare to split rdt_domain structure
>   x86/resctrl: Prepare for different scope for control/monitor
>     operations
>   x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
>   x86/resctrl: Add node-scope to the options for feature scope
>   x86/resctrl: Introduce snc_nodes_per_l3_cache
>   x86/resctrl: Block use of mba_MBps mount option on Sub-NUMA Cluster
>     (SNC) systems
>   x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files
>   x86/resctrl: Add a new field to struct rmid_read for summation of
>     domains
>   x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function
>   x86/resctrl: Allocate a new field in union mon_data_bits
>   x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files
>   x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC)
>     mode
>   x86/resctrl: Fill out rmid_read structure for smp_call*() to read a
>     counter
>   x86/resctrl: Make __mon_event_count() handle sum domains
>   x86/resctrl: Enable RMID shared RMID mode on Sub-NUMA Cluster (SNC)
>     systems
>   x86/resctrl: Sub-NUMA Cluster (SNC) detection
>   x86/resctrl: Update documentation with Sub-NUMA cluster changes
> 
>  Documentation/arch/x86/resctrl.rst        |  27 ++
>  include/linux/resctrl.h                   |  87 ++++--
>  arch/x86/include/asm/msr-index.h          |   1 +
>  arch/x86/kernel/cpu/resctrl/internal.h    |  93 +++++--
>  arch/x86/kernel/cpu/resctrl/core.c        | 312 ++++++++++++++++------
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  85 +++---
>  arch/x86/kernel/cpu/resctrl/monitor.c     | 242 ++++++++++++++---
>  arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  27 +-
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 272 ++++++++++++-------
>  9 files changed, 835 insertions(+), 311 deletions(-)
> 
> 
> base-commit: f385f024639431bec3e70c33cdbc9563894b3ee5

-- 
Thanks
Babu Moger
