lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7d149e94-ab9d-4db7-abb6-83aaec1e6735@intel.com>
Date: Wed, 25 Jun 2025 16:38:41 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tony.luck@...el.com>,
	<Dave.Martin@....com>, <james.morse@....com>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <akpm@...ux-foundation.org>,
	<rostedt@...dmis.org>, <paulmck@...nel.org>, <thuth@...hat.com>,
	<ardb@...nel.org>, <gregkh@...uxfoundation.org>, <seanjc@...gle.com>,
	<thomas.lendacky@....com>, <pawan.kumar.gupta@...ux.intel.com>,
	<manali.shukla@....com>, <perry.yuan@....com>, <kai.huang@...el.com>,
	<peterz@...radead.org>, <xiaoyao.li@...el.com>, <kan.liang@...ux.intel.com>,
	<mario.limonciello@....com>, <xin3.li@...el.com>, <gautham.shenoy@....com>,
	<xin@...or.com>, <chang.seok.bae@...el.com>, <fenghuay@...dia.com>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<eranian@...gle.com>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v14 29/32] fs/resctrl: Introduce the interface to modify
 assignments in a group

Hi Babu,

On 6/13/25 2:05 PM, Babu Moger wrote:
> Introduce the interface to modify assignments within a group when

nit: This cannot be an introduction since the "interface" (resctrl file)
already exists at this point so this patch enables it to support modifications.
Perhaps:
"Enable the mbm_l3_assignments resctrl file to be used to modify counter
assignments of CTRL_MON and MON groups when the "mbm_event" counter
assignment mode is enabled." (Please feel free to improve)


> "mbm_event" mode is enabled.



> 
> The assignment modifications are done in the following format:
> <Event>:<Domain id>=<Assignment state>
> 
> Event: A valid MBM event in the
>        /sys/fs/resctrl/info/L3_MON/event_configs directory.
> 
> Domain ID: A valid domain ID. When writing, '*' applies the changes
> 	   to all domains.
> 
> Assignment states:
> 
>     _ : Unassign the counter.
> 
>     e : Assign the counter exclusively.
> 
> Examples:
> 
> $ cd /sys/fs/resctrl
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=e;1=e
>   mbm_local_bytes:0=e;1=e
> 
> To unassign the counter associated with the mbm_total_bytes event on
> domain 0:
> 
> $ echo "mbm_total_bytes:0=_" > mbm_L3_assignments
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=_;1=e
>   mbm_local_bytes:0=e;1=e
> 
> To unassign the counter associated with the mbm_total_bytes event on
> all the domains:
> 
> $ echo "mbm_total_bytes:*=_" > mbm_L3_assignments
> $ cat /sys/fs/resctrl/mbm_L3_assignments
>   mbm_total_bytes:0=_;1=_
>   mbm_local_bytes:0=e;1=e
> 
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---

...

> ---
>  Documentation/filesystems/resctrl.rst | 146 +++++++++++++++++++++++-
>  fs/resctrl/internal.h                 |   9 ++
>  fs/resctrl/rdtgroup.c                 | 157 +++++++++++++++++++++++++-
>  3 files changed, 310 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index a232a0b1356c..cd82c2966ed7 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -527,7 +527,8 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>  	Event: A valid MBM event in the
>  	       /sys/fs/resctrl/info/L3_MON/event_configs directory.
>  
> -	Domain ID: A valid domain ID.
> +	Domain ID: A valid domain ID. When writing, '*' applies the changes
> +		   to all domains.
>  
>  	Assignment states:
>  
> @@ -544,6 +545,34 @@ When the "mba_MBps" mount option is used all CTRL_MON groups will also contain:
>  	   mbm_total_bytes:0=e;1=e
>  	   mbm_local_bytes:0=e;1=e
>  
> +	Assignments can be modified by writing to the interface.
> +
> +	Example:
> +	To unassign the counter associated with the mbm_total_bytes event on domain 0:
> +	::
> +
> +	 # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=_;1=e
> +	   mbm_local_bytes:0=e;1=e
> +
> +	To unassign the counter associated with the mbm_total_bytes event on all the domains:
> +	::
> +
> +	 # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=_;1=_
> +	   mbm_local_bytes:0=e;1=e
> +
> +	To assign the counter associated with the mbm_total_bytes event on all domains in
> +	exclusive mode:
> +	::
> +
> +	 # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
> +	 # cat /sys/fs/resctrl/mbm_L3_assignments
> +	   mbm_total_bytes:0=e;1=e
> +	   mbm_local_bytes:0=e;1=e
> +
>  Resource allocation rules
>  -------------------------
>  
> @@ -1579,6 +1608,121 @@ View the llc occupancy snapshot::
>    # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
>    11234000
>  
> +
> +Examples on working with mbm_assign_mode
> +========================================
> +
> +a. Check if MBM assign support is available

"MBM assign support"? I do not think this term has been used so far.

> +::
> +
> +  #mount -t resctrl resctrl /sys/fs/resctrl/
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  [mbm_event]
> +  default
> +
> +mbm_event feature is detected and it is enabled.
> +
> +b. Check how many assignable counters are supported.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
> +  0=32;1=32
> +
> +c. Check how many assignable counters are available for assignment in each domain.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
> +  0=30;1=30
> +
> +d. To list the default group's assign states:
> +::
> +
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=e;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +e.  To unassign the counter associated with the mbm_total_bytes event on domain 0:
> +::
> +
> +  # echo "mbm_total_bytes:0=_" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=_;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +f. To unassign the counter associated with the mbm_total_bytes event on all domains:
> +::
> +
> +  # echo "mbm_total_bytes:*=_" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignment
> +  mbm_total_bytes:0=_;1=_
> +  mbm_local_bytes:0=e;1=e
> +
> +g. To assign a counter associated with the mbm_total_bytes event on all domains i
> +nexclusive mode:

"in exclusive"

> +::
> +
> +  # echo "mbm_total_bytes:*=e" > /sys/fs/resctrl/mbm_L3_assignments
> +  # cat /sys/fs/resctrl/mbm_L3_assignments
> +  mbm_total_bytes:0=e;1=e
> +  mbm_local_bytes:0=e;1=e
> +
> +h. Read the events mbm_total_bytes and mbm_local_bytes of the default group. There is
> +no change in reading the events with the assignment.  If the event is unassigned when
> +reading, then the read will come back as "Unassigned".
> +::
> +
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> +  779247936
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  765207488
> +
> +i. Check the default event configurations.
> +::
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
> +  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
> +  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +  local_reads,local_non_temporal_writes,local_reads_slow_memory
> +
> +j. Change the event configuration for mbm_local_bytes.
> +::
> +
> +  # echo "local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads" >
> +  /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> +
> +  # cat /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_local_bytes/event_filter
> +  local_reads, local_non_temporal_writes, local_reads_slow_memory, remote_reads

Please let output match code wrt spacing.

> +
> +This will update all (across all domains of all monitor groups) counter assignments
> +associated with the mbm_local_bytes event.
> +
> +k. Now read the local event again. The first read may come back with "Unavailable"
> +status. The subsequent read of mbm_local_bytes will display only the read events.

(note comment about "read events" on duplicate text)

> +::
> +
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  Unavailable
> +  # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> +  314101
> +
> +l. Users have the option to go back to 'default' mbm_assign_mode if required. This can be
> +done using the following command. Note that switching the mbm_assign_mode will reset all
> +the MBM counters (and thus all MBM events) of all the resctrl groups.

hmmm ... earlier documentation about mbm_assign_mode changes was careful to use
"may reset", and here is it switched to "will reset". I am still cautious to make any
strong commitments about resctrl behavior in user documentation.

> +::
> +
> +  # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> +  mbm_event
> +  [default]
> +
> +m. Unmount the resctrl
> +::
> +
> +  #umount /sys/fs/resctrl/
> +
>  Intel RDT Errata
>  ================
>  
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index ed0e3b695ad5..14d99c723ea5 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -51,6 +51,15 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
>  	return container_of(kfc, struct rdt_fs_context, kfc);
>  }
>  
> +/*
> + * Assignment types for the monitor modes
> + */
> +enum {
> +	ASSIGN_NONE		= 0,
> +	ASSIGN_EXCLUSIVE,
> +	ASSIGN_INVALID,
> +};

I do not think this is necessary (more below)

> +
>  /**
>   * struct mon_evt - Description of a monitor event
>   * @evtid:		event id
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 18ec65801dbb..92bb8f3adfae 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -2129,6 +2129,160 @@ static int mbm_L3_assignments_show(struct kernfs_open_file *of, struct seq_file
>  	return ret;
>  }
>  
> +/**
> + * mbm_get_mon_event_by_name() - Return the mon_evt entry for the matching
> + * event name.
> + */
> +static struct mon_evt *mbm_get_mon_event_by_name(struct rdt_resource *r,

struct rdt_resource parameter seems to be unused ... but should be used to match
with mon_evt::rid?

> +						 char *name)
> +{
> +	struct mon_evt *mevt;
> +
> +	for (mevt = &mon_event_all[0]; mevt < &mon_event_all[QOS_NUM_EVENTS]; mevt++) {

(use macro)

> +		if (mevt->enabled && !strcmp(mevt->name, name))
> +			return mevt;
> +	}
> +
> +	return NULL;
> +}

This looks to be a utility that should be close to the data structure in
fs/resctrl/monitor.c. Please check if you can move monitoring code
to fs/resctrl/monitor.c.

> +
> +static unsigned int resctrl_get_assign_state(char *assign)
> +{
> +	if (!assign || strlen(assign) != 1)
> +		return ASSIGN_INVALID;
> +
> +	switch (*assign) {
> +	case 'e':
> +		return ASSIGN_EXCLUSIVE;

I think this can be simplified by calling resctrl_assign_cntr_event()
(rdtgroup_assign_cntr_event()) directly.

> +	case '_':
> +		return ASSIGN_NONE;

Here resctrl_unassign_cntr_event() (rdtgroup_unassign_cntr_event())
can be called directly.

> +	default:
> +		return ASSIGN_INVALID;
> +	}

With assign/unassign done the function can return proper error

> +}
> +
> +static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgrp,
> +				  char *event, char *tok)
> +{
> +	struct rdt_mon_domain *d;
> +	unsigned long dom_id = 0;
> +	char *dom_str, *id_str;
> +	struct mon_evt *mevt;
> +	int assign_state;
> +	char domain[10];
> +	bool found;
> +	int ret;
> +
> +	mevt = mbm_get_mon_event_by_name(r, event);
> +	if (!mevt) {
> +		rdt_last_cmd_printf("Invalid event %s\n", event);
> +		return  -ENOENT;
> +	}
> +
> +next:
> +	if (!tok || tok[0] == '\0')
> +		return 0;
> +
> +	/* Start processing the strings for each domain */
> +	dom_str = strim(strsep(&tok, ";"));
> +
> +	id_str = strsep(&dom_str, "=");
> +
> +	/* Check for domain id '*' which means all domains */
> +	if (id_str && *id_str == '*') {
> +		d = NULL;
> +		goto check_state;

Instead of "goto check_state" resctrl_get_assign_state() (with
more appropriate name after changes) can be called directly, its
return handled, possibly printing to last_cmd_status without needing
any sprintf() tricks, and exit from resctrl_process_assign().

Apart from simplifying the code an additional benefit is to avoid
(ab)use case where user/bot may write:
   # echo "mbm_total_bytes:*=_;*=e;*=_" > /sys/fs/resctrl/mbm_L3_assignments

> +	} else if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
> +		rdt_last_cmd_puts("Missing domain id\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Verify if the dom_id is valid */
> +	found = false;
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		if (d->hdr.id == dom_id) {

Similarly, resctrl_get_assign_state() (new name TBD) can be
called directly and "found" can be dropped.

> +			found = true;
> +			break;
> +		}
> +	}
> +
> +	if (!found) {
> +		rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
> +		return -EINVAL;
> +	}
> +
> +check_state:
> +	assign_state = resctrl_get_assign_state(dom_str);
> +
> +	switch (assign_state) {
> +	case ASSIGN_NONE:
> +		ret = resctrl_unassign_cntr_event(r, d, rdtgrp, mevt);
> +		break;
> +	case ASSIGN_EXCLUSIVE:
> +		ret = resctrl_assign_cntr_event(r, d, rdtgrp, mevt);
> +		break;
> +	case ASSIGN_INVALID:
> +		ret = -EINVAL;
> +	}
> +
> +	if (ret)
> +		goto out_fail;
> +
> +	goto next;
> +
> +out_fail:
> +	sprintf(domain, d ? "%ld" : "*", dom_id);
> +
> +	rdt_last_cmd_printf("Assign operation '%s:%s=%s' failed\n", event, domain, dom_str);
> +
> +	return ret;
> +}
> +
> +static ssize_t mbm_L3_assignments_write(struct kernfs_open_file *of, char *buf,
> +					size_t nbytes, loff_t off)
> +{
> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +	struct rdtgroup *rdtgrp;
> +	char *token, *event;
> +	int ret = 0;
> +
> +	/* Valid input requires a trailing newline */
> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
> +		return -EINVAL;
> +
> +	buf[nbytes - 1] = '\0';
> +
> +	rdtgrp = rdtgroup_kn_lock_live(of->kn);
> +	if (!rdtgrp) {
> +		rdtgroup_kn_unlock(of->kn);
> +		return -ENOENT;
> +	}
> +	rdt_last_cmd_clear();
> +
> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +		rdt_last_cmd_puts("mbm_event mode is not enabled\n");
> +		rdtgroup_kn_unlock(of->kn);
> +		return -EINVAL;
> +	}
> +
> +	while ((token = strsep(&buf, "\n")) != NULL) {
> +		/*
> +		 * The write command follows the following format:
> +		 * “<Event>:<Domain ID>=<Assignment state>”
> +		 * Extract the event name first.
> +		 */
> +		event = strsep(&token, ":");
> +
> +		ret = resctrl_process_assign(r, rdtgrp, event, token);
> +		if (ret)
> +			break;
> +	}
> +
> +	rdtgroup_kn_unlock(of->kn);
> +
> +	return ret ?: nbytes;
> +}
> +
>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
> @@ -2269,9 +2423,10 @@ static struct rftype res_common_files[] = {
>  	},
>  	{
>  		.name		= "mbm_L3_assignments",
> -		.mode		= 0444,
> +		.mode		= 0644,
>  		.kf_ops		= &rdtgroup_kf_single_ops,
>  		.seq_show	= mbm_L3_assignments_show,
> +		.write		= mbm_L3_assignments_write,
>  	},
>  	{
>  		.name		= "mbm_assign_mode",

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ