lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <215dc37f-bfcf-4560-8f31-3774307f4ede@intel.com>
Date: Fri, 25 Jul 2025 16:53:57 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
 Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
	<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
	<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
	<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v7 27/31] x86,fs/resctrl: Move RMID initialization to
 first mount

Hi Tony,

On 7/11/25 4:53 PM, Tony Luck wrote:
> The resctrl file system code assumed that the only monitor events were

resctrl assumes ... etc. etc.

> tied to the RDT_RESOURCE_L3 resource. Also that the number of supported
> RMIDs was enumerated during early initialization.
> 
> RDT_RESOURCE_PERF_PKG breaks both of those assumptions.

Please give detail how assumptions are broken. 

> 
> Delay the final enumeration of the number of RMIDs and subsequent
> allocation of structures until first mount of the resctrl file system
> so that the number of usable RMIDs can be computed as the minimum
> value from all enabled monitor resources.

This needs more thought. The idea of "final enumeration of the number of RMIDs"
does not exist. This patch modifies resctrl_arch_system_num_rmid_idx() to
compute number of RMIDs differently but there is *no* change to when
resctrl_arch_system_num_rmid_idx() is called. For example, 
resctrl_arch_system_num_rmid_idx() is called as part of L3 monitor domain
initialization during CPU online that can happen long before resctrl mount.
resctrl_arch_system_num_rmid_idx() will return the number of RMIDs known
at that time that may be different from the "final enumeration". The
L3 monitor domain structures are thus created with potentially a
different RMID count than what the system will end up being able to use.

Claiming that this "delay of final enumeration of number of RMIDs and subsequent
allocation of structures until first mount" applies to all enabled monitor
resources is false.

Allocating the L3 monitoring data structures early based on RMID values known at
that time may be ok based on how system-wide RMIDs are chosen (it can only be smaller).
This will thus result in wasted space but I expect will work just fine.
A comment similar to that used during closid_num_dirty_rmid
could work but hiding implications like this just makes resctrl code
harder to understand and maintain.

> Since the dom_data* functions now only allocate/free structures
> used for RMIDs, rename: dom_data_init() -> rmid_init(),
> dom_data_exit() -> rmid_exit().

These names seem very generic. How about setup_rmid_lru_list()/free_rmid_lru_list()?

> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  fs/resctrl/internal.h              |  2 ++
>  arch/x86/kernel/cpu/resctrl/core.c |  8 ++++++--
>  fs/resctrl/monitor.c               | 26 +++++++++-----------------
>  fs/resctrl/rdtgroup.c              |  6 ++++++
>  4 files changed, 23 insertions(+), 19 deletions(-)
> 
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 28d505efdb7c..7fca1849742f 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -358,6 +358,8 @@ int alloc_rmid(u32 closid);
>  
>  void free_rmid(u32 closid, u32 rmid);
>  
> +int rmid_init(void);
> +
>  int resctrl_mon_l3_resource_init(void);
>  
>  void resctrl_mon_l3_resource_exit(void);
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 31fb598482bf..1a6635cc5b37 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -112,10 +112,14 @@ struct rdt_hw_resource rdt_resources_all[RDT_NUM_RESOURCES] = {
>  
>  u32 resctrl_arch_system_num_rmid_idx(void)
>  {
> -	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	u32 num_rmids = U32_MAX;
> +	struct rdt_resource *r;
> +
> +	for_each_mon_capable_rdt_resource(r)
> +		num_rmids = min(num_rmids, r->num_rmid);
>  
>  	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
> -	return r->num_rmid;
> +	return num_rmids == U32_MAX ? 0 : num_rmids;
>  }
>  
>  struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index e3eceba70713..3fe81c43e5e8 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -833,20 +833,19 @@ void mbm_setup_overflow_handler(struct rdt_l3_mon_domain *dom, unsigned long del
>  		schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
>  }
>  
> -static int dom_data_init(struct rdt_resource *r)
> +int rmid_init(void)
>  {
>  	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
>  	struct rmid_entry *entry = NULL;
> -	int err = 0, i;
>  	u32 idx;
> +	int i;
>  
> -	mutex_lock(&rdtgroup_mutex);
> +	if (rmid_ptrs)
> +		return 0;
>  
>  	rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
> -	if (!rmid_ptrs) {
> -		err = -ENOMEM;
> -		goto out_unlock;
> -	}
> +	if (!rmid_ptrs)
> +		return -ENOMEM;
>  
>  	for (i = 0; i < idx_limit; i++) {
>  		entry = &rmid_ptrs[i];
> @@ -866,13 +865,10 @@ static int dom_data_init(struct rdt_resource *r)
>  	entry = __rmid_entry(idx);
>  	list_del(&entry->list);
>  
> -out_unlock:
> -	mutex_unlock(&rdtgroup_mutex);
> -
> -	return err;
> +	return 0;
>  }
>  
> -static void dom_data_exit(struct rdt_resource *r)
> +static void rmid_exit(struct rdt_resource *r)
>  {
>  	mutex_lock(&rdtgroup_mutex);
>  
> @@ -965,10 +961,6 @@ int resctrl_mon_l3_resource_init(void)

Please take a look at all functions modified to ensure function comments
are still accurate. For example, resctrl_mon_l3_resource_init() still
claims to manage rmid_ptrs[] after this change ...

>  	if (ret)
>  		return ret;
>  
> -	ret = dom_data_init(r);
> -	if (ret)
> -		return ret;
> -
>  	if (resctrl_arch_is_evt_configurable(QOS_L3_MBM_TOTAL_EVENT_ID)) {
>  		mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].configurable = true;
>  		resctrl_file_fflags_init("mbm_total_bytes_config",
> @@ -993,5 +985,5 @@ void resctrl_mon_l3_resource_exit(void)
>  	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>  
>  	closid_num_dirty_rmid_exit();
> -	dom_data_exit(r);
> +	rmid_exit(r);

Please do not call rmid_exit() from resctrl_mon_l3_resource_exit(). Doing so
breaks symmetry with resctrl_mon_l3_resource_init() which is especially
confusing with resctrl_mon_l3_resource_exit() called on failure exit 
from resctrl_mon_l3_resource_init(). 

It can just be called directly from resctrl_exit()?

>  }
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index b45f3d63c629..9e667d3a93ae 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -2599,6 +2599,12 @@ static int rdt_get_tree(struct fs_context *fc)
>  		goto out;
>  	}
>  
> +	if (resctrl_arch_mon_capable()) {
> +		ret = rmid_init();

The reference to "resctrl_init()" in comments seem quite stale now.

> +		if (ret)
> +			goto out;
> +	}
> +
>  	ret = rdtgroup_setup_root(ctx);
>  	if (ret)
>  		goto out;

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ