Message-ID: <99614342-b6ce-47ec-baf9-f5cdf42f77be@intel.com>
Date: Wed, 30 Jul 2025 13:04:54 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tony.luck@...el.com>,
	<james.morse@....com>, <tglx@...utronix.de>, <mingo@...hat.com>,
	<bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <Dave.Martin@....com>, <x86@...nel.org>, <hpa@...or.com>,
	<akpm@...ux-foundation.org>, <paulmck@...nel.org>, <rostedt@...dmis.org>,
	<Neeraj.Upadhyay@....com>, <david@...hat.com>, <arnd@...db.de>,
	<fvdl@...gle.com>, <seanjc@...gle.com>, <jpoimboe@...nel.org>,
	<pawan.kumar.gupta@...ux.intel.com>, <xin@...or.com>,
	<manali.shukla@....com>, <tao1.su@...ux.intel.com>, <sohil.mehta@...el.com>,
	<kai.huang@...el.com>, <xiaoyao.li@...el.com>, <peterz@...radead.org>,
	<xin3.li@...el.com>, <kan.liang@...ux.intel.com>,
	<mario.limonciello@....com>, <thomas.lendacky@....com>, <perry.yuan@....com>,
	<gautham.shenoy@....com>, <chang.seok.bae@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<peternewman@...gle.com>, <eranian@...gle.com>
Subject: Re: [PATCH v16 25/34] fs/resctrl: Add event configuration directory
 under info/L3_MON/

Hi Babu,

On 7/25/25 11:29 AM, Babu Moger wrote:


> ---
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index 4c24c5f3f4c1..3dfc177f9792 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -310,6 +310,38 @@ with the following files:
>  	  # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>  	  0=30;1=30
>  
> +"event_configs":
> +	Directory that exists when "mbm_event" counter assignment mode is supported.
> +	Contains sub-directory for each MBM event that can be assigned to a counter.

"Contains sub-directory" -> "Contains a sub-directory"?

> +
> +	Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
> +	Each MBM event's sub-directory contains a file named "event_filter" that is
> +	used to view and modify which memory transactions the MBM event is configured
> +	with.
> +
> +	List of memory transaction types supported:
> +
> +	==========================  ========================================================
> +	Name			    Description
> +	==========================  ========================================================
> +	dirty_victim_writes_all     Dirty Victims from the QOS domain to all types of memory
> +	remote_reads_slow_memory    Reads to slow memory in the non-local NUMA domain
> +	local_reads_slow_memory     Reads to slow memory in the local NUMA domain
> +	remote_non_temporal_writes  Non-temporal writes to non-local NUMA domain
> +	local_non_temporal_writes   Non-temporal writes to local NUMA domain
> +	remote_reads                Reads to memory in the non-local NUMA domain
> +	local_reads                 Reads to memory in the local NUMA domain
> +	==========================  ========================================================
> +
> +	For example::
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
> +	  local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
> +	  local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
> +
> +	  # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
> +	  local_reads,local_non_temporal_writes,local_reads_slow_memory
> +
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
>  		bytes) at which a previously used LLC_occupancy

...

> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 16bcfeeb89e6..fa5f63126682 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
>  	{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
>  };
>  
> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
> +{
> +	struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
> +	bool sep = false;
> +	int i;
> +
> +	mutex_lock(&rdtgroup_mutex);
> +

The files introduced are inconsistent in how they handle the
"mbm_event mode disabled" case. Some files return failure from their
_show()/_write() when mbm_event mode is disabled, some don't.

The "event_filter" file always prints the MBM transactions monitored
when assignable counters are supported, whether mbm_event mode is enabled
or not. This means that the MBM event's configuration values are also
printed when "default" mode is enabled. I have two concerns about this:
1) This is potentially very confusing: switching to "default" mode makes
   the BMEC files visible, which lets the user modify the event
   configurations per domain. Having this file print a global event
   configuration while various different domain-specific configurations
   may be active will be confusing.
2) Can it be guaranteed that the MBM events will monitor the default
   assignable counter memory transactions when in "default" mode? It has
   never been possible to query which memory transactions are monitored by
   the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features,
   so this seems to use one feature to deduce the capabilities of another?
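
To illustrate one possible way of making the handling consistent, here is a
rough userspace sketch, assuming the chosen policy is to fail when mbm_event
mode is not enabled. All names (event_filter_format(), struct
mbm_transaction_model) and the bit values are hypothetical stand-ins, not the
kernel definitions, and -EINVAL is only an assumed error code. The sketch
omits the trailing newline that the quoted event_filter_show() emits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel table; bit values are illustrative. */
struct mbm_transaction_model {
	const char *name;
	unsigned int val;
};

static const struct mbm_transaction_model transactions[] = {
	{ "local_reads",                1u << 0 },
	{ "remote_reads",               1u << 1 },
	{ "local_non_temporal_writes",  1u << 2 },
	{ "remote_non_temporal_writes", 1u << 3 },
	{ "local_reads_slow_memory",    1u << 4 },
	{ "remote_reads_slow_memory",   1u << 5 },
	{ "dirty_victim_writes_all",    1u << 6 },
};

/*
 * Write the transactions selected in evt_cfg to buf as a comma-separated
 * list, following the loop in the quoted event_filter_show().
 *
 * Assumed policy (not in the patch): fail with -EINVAL when mbm_event
 * mode is not enabled instead of printing a global configuration.
 *
 * Returns the number of characters written, or a negative errno value.
 */
static int event_filter_format(bool mbm_event_enabled, unsigned int evt_cfg,
			       char *buf, size_t len)
{
	size_t used = 0;
	bool sep = false;
	size_t i;

	assert(buf && len > 0);

	if (!mbm_event_enabled)
		return -EINVAL;

	buf[0] = '\0';
	for (i = 0; i < sizeof(transactions) / sizeof(transactions[0]); i++) {
		size_t need;

		if (!(evt_cfg & transactions[i].val))
			continue;

		/* Name plus an optional separator; keep room for the NUL. */
		need = strlen(transactions[i].name) + (sep ? 1 : 0);
		if (need >= len - used)
			return -ENOSPC;

		snprintf(buf + used, len - used, "%s%s",
			 sep ? "," : "", transactions[i].name);
		used += need;
		sep = true;
	}

	return (int)used;
}
```

With these illustrative bit values, event_filter_format(true, 0x15, ...)
produces the same list as the mbm_local_bytes example in the documentation
hunk, while any call with mbm_event mode disabled fails before touching the
buffer.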



> +	for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
> +		if (mevt->evt_cfg & mbm_transactions[i].val) {
> +			if (sep)
> +				seq_putc(seq, ',');
> +			seq_printf(seq, "%s", mbm_transactions[i].name);
> +			sep = true;
> +		}
> +	}
> +	seq_putc(seq, '\n');
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +
> +	return 0;
> +}
> +
>  /**
>   * resctrl_mon_resource_init() - Initialise global monitoring structures.
>   *
> @@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void)
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
>  		resctrl_file_fflags_init("available_mbm_cntrs",
>  					 RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
> +		resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
>  	}
>  
>  	return 0;

...

> @@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
>  		return ret;
>  
>  	ret = rdtgroup_add_files(kn_subdir, fflags);
> +	if (ret)
> +		return ret;
> +
> +	if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
> +		r = priv;
> +		if (r->mon.mbm_cntr_assignable) {
> +			ret = resctrl_mkdir_event_configs(r, kn_subdir);
> +			if (ret)
> +				return ret;
> +		}
> +	}
> +
>  	if (!ret)
>  		kernfs_activate(kn_subdir);
>  

Looks like the "if (!ret)" above can be dropped so that "kernfs_activate(kn_subdir)"
is always called on exit, making it clear that this is the success path and that
the function exits early on any error.
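
The resulting control flow could look roughly like the userspace model below.
This is only a sketch: struct mkdir_model and mkdir_info_resdir_model() are
hypothetical, and the struct fields stand in for the return values of
rdtgroup_add_files() and resctrl_mkdir_event_configs() and for the effect of
kernfs_activate().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: the caller presets what each modelled call returns. */
struct mkdir_model {
	int add_files_ret;		/* models rdtgroup_add_files() */
	int mkdir_event_configs_ret;	/* models resctrl_mkdir_event_configs() */
	bool mon_info;			/* models the RFTYPE_MON_INFO check */
	bool mbm_cntr_assignable;	/* models r->mon.mbm_cntr_assignable */
	bool activated;			/* set when kernfs_activate() would run */
};

static int mkdir_info_resdir_model(struct mkdir_model *m)
{
	int ret;

	assert(m);

	ret = m->add_files_ret;
	if (ret)
		return ret;

	if (m->mon_info && m->mbm_cntr_assignable) {
		ret = m->mkdir_event_configs_ret;
		if (ret)
			return ret;
	}

	/* Only reached on success, so no "if (!ret)" guard is needed. */
	m->activated = true;

	return 0;
}
```

Every failure returns before the activation step, so the tail of the function
reads as the success path only.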

Reinette



