lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b538ee53-67cb-426a-a748-16b0a81865de@intel.com>
Date: Tue, 16 Dec 2025 21:53:20 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Aaron Tomlin <atomlin@...mlin.com>, <tony.luck@...el.com>,
	<Dave.Martin@....com>, <james.morse@....com>, <babu.moger@....com>,
	<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>
CC: <sean@...e.io>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 3/3] x86/resctrl: Add "*" shorthand to set minimum
 io_alloc CBM for all domains

Hi Aaron,

On 12/15/25 3:02 PM, Aaron Tomlin wrote:
> This patch introduces a special domain ID selector "*" for io_alloc_cbm

Kernel code and patches to kernel have a couple of style requirements that
are documented. It takes some getting used to but it is required that submissions
follow the kernel style in order to be considered for inclusion.
Documentation relevant to above are:
Documentation/process/submitting-patches.rst: Consider whole document but specifically
search for "This patch" for information related to above.
Documentation/process/maintainer-tip.rst: Consider whole document but specifically
consider "Changelog" relevant to this part.

> that allows setting the CBM of each cache domain to its minimum number
> of consecutive bits in a single operation. For example, writing "*=0" to

Is there a reason to limit this implementation to only support the minimum CBM?
This feature can easily enable a user to set identical CBM across all domains
without the CBM being required to be the minimum CBM, no?

> /sys/fs/resctrl/info/L3/io_alloc_cbm programs each domain's CBM to the
> hardware-defined minimum, greatly simplifying automation and management
> tasks. The user is required to specify the correct minimum stored in
> /sys/fs/resctrl/info/L3/min_cbm_bits.
> 
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
>  Documentation/filesystems/resctrl.rst     |  10 ++
>  arch/x86/kernel/cpu/resctrl/core.c        |   2 +-
>  arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  23 ++--
>  fs/resctrl/ctrlmondata.c                  | 141 ++++++++++++++++------
>  fs/resctrl/internal.h                     |  13 ++
>  fs/resctrl/rdtgroup.c                     |   2 +-
>  include/linux/resctrl.h                   |  30 ++++-
>  7 files changed, 174 insertions(+), 47 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 8c8ce678148a..9c128c8fba86 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -215,6 +215,16 @@ related to allocation:
>  			# cat /sys/fs/resctrl/info/L3/io_alloc_cbm
>  			0=00ff;1=000f
>  
> +		Set each CBM to their minimum number of consecutive bits.
> +
> +		A special value "*" is required to represent all cache IDs.
> +		The user should specify the correct minimum stored in
> +		/sys/fs/resctrl/info/L3/min_cbm_bits.
> +
> +		Example::
> +
> +			# echo "*=0" > /sys/fs/resctrl/info/L3/io_alloc_cbm
> +
>  		When CDP is enabled "io_alloc_cbm" associated with the CDP_DATA and CDP_CODE
>  		resources may reflect the same values. For example, values read from and
>  		written to /sys/fs/resctrl/info/L3DATA/io_alloc_cbm may be reflected by
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 3792ab4819dc..44aea6b534e0 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -276,7 +276,7 @@ static void rdt_get_cdp_config(int level)
>  
>  static void rdt_set_io_alloc_capable(struct rdt_resource *r)
>  {
> -	r->cache.io_alloc_capable = true;
> +	r->cache.io_alloc.io_alloc_capable = true;
>  }
>  
>  static void rdt_get_cdp_l3_config(void)
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index b20e705606b8..0f051d848422 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -57,14 +57,19 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
>  		hw_dom = resctrl_to_arch_ctrl_dom(d);
>  		msr_param.res = NULL;
>  		for (t = 0; t < CDP_NUM_TYPES; t++) {
> -			cfg = &hw_dom->d_resctrl.staged_config[t];
> -			if (!cfg->have_new_ctrl)
> -				continue;
> -
> -			idx = resctrl_get_config_index(closid, t);
> -			if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
> -				continue;
> -			hw_dom->ctrl_val[idx] = cfg->new_ctrl;
> +			if (resctrl_should_io_alloc_min_cbm(r)) {
> +				idx = resctrl_get_config_index(closid, t);
> +				hw_dom->ctrl_val[idx] = apply_io_alloc_min_cbm(r);
> +			} else {
> +				cfg = &hw_dom->d_resctrl.staged_config[t];
> +				if (!cfg->have_new_ctrl)
> +					continue;
> +
> +				idx = resctrl_get_config_index(closid, t);
> +				if (cfg->new_ctrl == hw_dom->ctrl_val[idx])
> +					continue;
> +				hw_dom->ctrl_val[idx] = cfg->new_ctrl;
> +			}
>  

No. resctrl_arch_update_domains() is a helper to just program configurations. Please avoid bleeding
feature specific handling into dedicated helpers. 

>  			if (!msr_param.res) {
>  				msr_param.low = idx;
> @@ -123,7 +128,7 @@ int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
>  {
>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>  
> -	if (hw_res->r_resctrl.cache.io_alloc_capable &&
> +	if (hw_res->r_resctrl.cache.io_alloc.io_alloc_capable &&
>  	    hw_res->sdciae_enabled != enable) {
>  		_resctrl_sdciae_enable(r, enable);
>  		hw_res->sdciae_enabled = enable;
> diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> index d2f9a4f2d887..a36b9055a1be 100644
> --- a/fs/resctrl/ctrlmondata.c
> +++ b/fs/resctrl/ctrlmondata.c
> @@ -688,7 +688,7 @@ int resctrl_io_alloc_show(struct kernfs_open_file *of, struct seq_file *seq, voi
>  
>  	mutex_lock(&rdtgroup_mutex);
>  
> -	if (r->cache.io_alloc_capable) {
> +	if (r->cache.io_alloc.io_alloc_capable) {
>  		if (resctrl_arch_get_io_alloc_enabled(r))
>  			seq_puts(seq, "enabled\n");
>  		else
> @@ -712,13 +712,31 @@ static bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
>  	return io_alloc_closid < closids_supported();
>  }
>  
> +/**
> + * resctrl_sync_cdp_peer - Mirror staged configuration to the CDP peer type
> + *
> + * @s: resctrl schema
> + * @d: resctrl control domain
> + *
> + * The caller must ensure CDP is enabled for the resource and must be
> + * called under the cpu hotplug lock and rdtgroup mutex
> + */
> +static void resctrl_sync_cdp_peer(struct resctrl_schema *s, struct rdt_ctrl_domain *d)
> +{
> +	enum resctrl_conf_type peer_type;
> +
> +	peer_type = resctrl_peer_type(s->conf_type);
> +	memcpy(&d->staged_config[peer_type],
> +	       &d->staged_config[s->conf_type],
> +	       sizeof(d->staged_config[0]));
> +}
> +
>  /*
>   * Initialize io_alloc CLOSID cache resource CBM with all usable (shared
>   * and unused) cache portions.
>   */
>  static int resctrl_io_alloc_init_cbm(struct resctrl_schema *s, u32 closid)
>  {
> -	enum resctrl_conf_type peer_type;
>  	struct rdt_resource *r = s->res;
>  	struct rdt_ctrl_domain *d;
>  	int ret;
> @@ -731,11 +749,8 @@ static int resctrl_io_alloc_init_cbm(struct resctrl_schema *s, u32 closid)
>  
>  	/* Keep CDP_CODE and CDP_DATA of io_alloc CLOSID's CBM in sync. */
>  	if (resctrl_arch_get_cdp_enabled(r->rid)) {
> -		peer_type = resctrl_peer_type(s->conf_type);
>  		list_for_each_entry(d, &s->res->ctrl_domains, hdr.list)
> -			memcpy(&d->staged_config[peer_type],
> -			       &d->staged_config[s->conf_type],
> -			       sizeof(d->staged_config[0]));
> +			resctrl_sync_cdp_peer(s, d);
>  	}
>  
>  	ret = resctrl_arch_update_domains(r, closid);
> @@ -903,49 +918,103 @@ int resctrl_io_alloc_cbm_show(struct kernfs_open_file *of, struct seq_file *seq,
>  	return ret;
>  }
>  
> +/**
> + * parse_domain_cbm - Parse "domain=cbm" and return either side
> + *
> + * @line_ptr: Pointer to the input string
> + * @out_val: Output from parsed value (either domain ID or CDM)
> + * @which: Selector for which side to parse
> + *
> + * It is assumed that *line_ptr and *out_val are valid.
> + *
> + * Return: 0 on success and advance *line_ptr to point past the
> + * delimiter, EINVAL on parsing error
> + */
> +static inline int parse_domain_cbm(char **line_ptr, unsigned long *out_val,
> +				   enum resctrl_kv_sel which)
> +{
> +	char *rhs, *lhs, *tok;
> +	unsigned int base;
> +
> +	rhs = *line_ptr;
> +
> +	lhs = strsep(&rhs, "=");
> +	if (!lhs || !rhs)
> +		goto err;
> +
> +	if (which == RDT_PARSE_DOMAIN) {
> +		tok = lhs;
> +		base = 10;
> +	} else {
> +		tok = rhs;
> +		base = 16;
> +	}
> +	tok = strim(tok);
> +
> +	if (kstrtoul(tok, base, out_val))
> +		goto err;
> +
> +	*line_ptr = rhs;
> +	return 0;
> +err:
> +	rdt_last_cmd_puts("Invalid domain=cbm: missing '=' or non-numeric value\n");
> +	return -EINVAL;
> +}

Why is this new parser needed? 

> +
>  static int resctrl_io_alloc_parse_line(char *line,  struct rdt_resource *r,
>  				       struct resctrl_schema *s, u32 closid)
>  {
> -	enum resctrl_conf_type peer_type;
>  	struct rdt_parse_data data;
>  	struct rdt_ctrl_domain *d;
> -	char *dom = NULL, *id;
> -	unsigned long dom_id;
> +	char *cbm = NULL;
> +	unsigned long out_val;
> +	int ret;
>  
> -next:
> -	if (!line || line[0] == '\0')
> -		return 0;
> +	if (line[0] == '*') {
> +		ret = parse_domain_cbm(&line, &out_val, RDT_PARSE_CBM);
> +		if (ret)
> +			return ret;
>  
> -	dom = strsep(&line, ";");
> -	id = strsep(&dom, "=");
> -	if (!dom || kstrtoul(id, 10, &dom_id)) {
> -		rdt_last_cmd_puts("Missing '=' or non-numeric domain\n");
> +		if (out_val == r->cache.min_cbm_bits) {

This does not look right. If I understand correctly out_val contains the user
provided value/CBM, that is if user provided "*=3" then out_val contains "3". This value
is in turn compared against the *number of bits* required. First, if indeed just forcing
user to set the minimum CBM (which I already mentioned seems restrictive) then if out_val
is "3" and min_cbm_bits is "2" then this check will fail while the user indeed did set
the minimum CBM, no? Additionally, user providing value of 6 or C or any
other value with two consecutive bits set would also be a valid value as a "minimum", no?

This additional parsing and checks adds unnecessary complexity. Just keep original parsing
that relies on parse_cbm() that calls cbm_validate() that ensures the provided CBM is valid for
this resource while taking min_cbm_bits into account.


> +			r->cache.io_alloc.io_alloc_min_cbm = true;
> +			return 0;

If I understand correctly the parsing discards user provided value. If this check thinks it needs to
program the min CBM it appears to program the min CBM via a backdoor in the config write helper that
circumvents the flow used by everything else in resctrl that first stages the values and then calls
the config helper to write to hardware. This is not acceptable. Please integrate any new feature into
existing flows that follow existing patterns. Punching a feature specific back door via this new state
(io_alloc_min_cbm) is not appropriate.

> +		}
> +		rdt_last_cmd_puts("Invalid io_alloc min CBM\n");
>  		return -EINVAL;
>  	}
>  
> -	dom = strim(dom);
> -	list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> -		if (d->hdr.id == dom_id) {
> -			data.buf = dom;
> -			data.mode = RDT_MODE_SHAREABLE;
> -			data.closid = closid;
> -			if (parse_cbm(&data, s, d))
> -				return -EINVAL;
> -			/*
> -			 * Keep io_alloc CLOSID's CBM of CDP_CODE and CDP_DATA
> -			 * in sync.
> -			 */
> -			if (resctrl_arch_get_cdp_enabled(r->rid)) {
> -				peer_type = resctrl_peer_type(s->conf_type);
> -				memcpy(&d->staged_config[peer_type],
> -				       &d->staged_config[s->conf_type],
> -				       sizeof(d->staged_config[0]));
> +	while (line && line[0] != '\0') {
> +		bool found = false;
> +
> +		cbm = strsep(&line, ";");
> +		ret = parse_domain_cbm(&cbm, &out_val, RDT_PARSE_DOMAIN);
> +		if (ret)
> +			return ret;
> +
> +		cbm = strim(cbm);
> +		list_for_each_entry(d, &r->ctrl_domains, hdr.list) {
> +			if (d->hdr.id == out_val) {
> +				data.buf = cbm;
> +				data.mode = RDT_MODE_SHAREABLE;
> +				data.closid = closid;
> +				if (parse_cbm(&data, s, d))
> +					return -EINVAL;
> +				/*
> +				 * Keep io_alloc CLOSID's CBM of CDP_CODE and CDP_DATA
> +				 * in sync.
> +				 */
> +				if (resctrl_arch_get_cdp_enabled(r->rid))
> +					resctrl_sync_cdp_peer(s, d);
> +				found = true;
> +				break;
>  			}
> -			goto next;
>  		}
> +
> +		if (!found)
> +			return -EINVAL;
>  	}
>  
> -	return -EINVAL;
> +	return 0;
>  }

I agree with Babu [1]. This is a lot of complexity just to support handling of an additional character.
resctrl does a lot of parsing of per-domain user space input and the current implementation follows the
same pattern used throughout resctrl. Introducing a new pattern and parser unique to IO alloc feature adds
unnecessary complexity. Simple and consistent is better. I considered your response [2] but to me this
code is not more readable than a simple implementation built on top of existing, familiar, and consistent
patterns. Another motivation from [2] is that this is easier to maintain but I fail to see that. About the
motivation from [2] for "possible enhancements down the line" ... when/if need for an enhancement
is required then code can be refactored to support it.

On a workflow note: please allow for discussions about your submission. Taking a long time to respond and then
responding that you disagree with feedback immediately followed by a new version of the series that does not
address feedback is not productive.

Reinette

[1] https://lore.kernel.org/lkml/239e3a89-1075-439b-bf1c-d2be5be5c30c@amd.com/
[2] https://lore.kernel.org/lkml/5xlapgkp5bktan7xhy6l6b7c4qgeje7weu4cy6cbuux5npwijo@lhf5uvtuns5k/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ