Message-ID: <f11e8821-3d7c-4a05-baf0-14f87f9fe541@intel.com>
Date: Thu, 14 Aug 2025 15:01:16 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
 Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
	<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
	<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
	<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 32/32] x86,fs/resctrl: Update Documentation for package
 events

Hi Tony,

subject: "Documentation" -> "documentation"

On 8/11/25 11:17 AM, Tony Luck wrote:
> Each "mon_data" directory is now divided between L3 events and package
> events.
> 
> The "info/PERF_PKG_MON" directory contains parameters for perf events.

This changelog reads as an incomplete and cryptic summary of the resctrl
interface changes made in this series. If the goal is to document the
new interfaces then please add a summary of them. Alternatively, the
changelog could simply be succinct, like:
	"Update resctrl filesystem documentation with the details about the
	 resctrl files that support telemetry events."

> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  Documentation/filesystems/resctrl.rst | 85 +++++++++++++++++++++++----
>  1 file changed, 75 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index c7949dd44f2f..065f9fdd8f95 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -167,7 +167,7 @@ with respect to allocation:
>  			bandwidth percentages are directly applied to
>  			the threads running on the core
>  
> -If RDT monitoring is available there will be an "L3_MON" directory
> +If L3 monitoring is available there will be an "L3_MON" directory
>  with the following files:
>  
>  "num_rmids":

I expect that L3_MON's num_rmids also needs an update since it no longer
dictates the upper bound on how many "CTRL_MON" + "MON" groups can be created.

> @@ -261,6 +261,18 @@ with the following files:
>  		bytes) at which a previously used LLC_occupancy
>  		counter can be considered for re-use.
>  
> +If telemetry monitoring is available there will be an "PERF_PKG_MON" directory
> +with the following files:
> +
> +"num_rmids":
> +		The number of telemetry RMIDs supported. If this is different
> +		from the number reported in the L3_MON directory the limit
> +		on the number of "CTRL_MON" + "MON" directories is the
> +		minimum of the values.
> +
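
To make the "minimum of the values" rule above concrete, here is a minimal
sketch of how user space might derive the group limit. It assumes resctrl is
mounted at /sys/fs/resctrl and that every info/*_MON directory exposes a
num_rmids file holding one decimal integer. The helper name group_limit() is
hypothetical, and the rule itself is taken from the proposed text, not from
the final implementation.

```python
from pathlib import Path


def group_limit(resctrl_root="/sys/fs/resctrl"):
    """Return the smallest num_rmids found under info/*_MON, or None.

    Assumption: each num_rmids file holds a single decimal integer and the
    limit on "CTRL_MON" + "MON" groups is the minimum across resources, as
    the proposed documentation text states.
    """
    values = []
    for path in Path(resctrl_root).glob("info/*_MON/num_rmids"):
        values.append(int(path.read_text().strip()))
    return min(values) if values else None


if __name__ == "__main__":
    print(group_limit())
```

If no monitoring resource is present the helper returns None, so a caller
cannot mistake "no monitoring" for a limit of zero.
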
> +"mon_features":
> +		Lists the telemetry monitoring events that are enabled on this system.
> +
>  Finally, in the top level of the "info" directory there is a file
>  named "last_cmd_status". This is reset with every "command" issued
>  via the file system (making new directories or writing to any of the
> @@ -366,15 +378,36 @@ When control is enabled all CTRL_MON groups will also contain:
>  When monitoring is enabled all MON groups will also contain:
>  
>  "mon_data":
> -	This contains a set of files organized by L3 domain and by
> -	RDT event. E.g. on a system with two L3 domains there will
> -	be subdirectories "mon_L3_00" and "mon_L3_01".	Each of these
> -	directories have one file per event (e.g. "llc_occupancy",
> -	"mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
> -	files provide a read out of the current value of the event for
> -	all tasks in the group. In CTRL_MON groups these files provide
> -	the sum for all tasks in the CTRL_MON group and all tasks in
> -	MON groups. Please see example section for more details on usage.
> +	This contains a set of directories, one for each instance
> +	of an L3 cache, or of a processor package. The L3 cache

A monitor group may have directories for L3 cache *and* processor packages, right?

> +	directories are named "mon_L3_00", "mon_L3_01" etc. The
> +	package directories "mon_PERF_PKG_00", "mon_PERF_PKG_01" etc.
> +
> +	Within each directory there is one file per event. In
> +	the L3 directories: "llc_occupancy", "mbm_total_bytes",
> +	and "mbm_local_bytes". In the PERF_PKG directories: "core_energy",
> +	"activity", etc.

I do not think these should be hardcoded as the expected events. The original text
was careful to use "e.g." when mentioning the L3 events, while the new text seems to
present all three events as fact. I wonder if it would be easier to read if
this referred to the resource's mon_features file to learn which events can be
expected to be present, with the mon_features section being where all the intricate
details of events are documented. Compare with the L3 mon_features section.
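
As an illustration of why pointing at mon_features is enough, here is a
minimal sketch of how a tool could discover the expected event files instead
of relying on a hardcoded list. It assumes resctrl is mounted at
/sys/fs/resctrl and that each info/*_MON/mon_features file lists
whitespace-separated event names. The helper name expected_events() is
hypothetical.

```python
from pathlib import Path


def expected_events(resctrl_root="/sys/fs/resctrl"):
    """Map each monitoring resource name to the events it advertises.

    The resource name is the info directory name with the "_MON" suffix
    removed, e.g. "L3_MON" -> "L3".
    """
    result = {}
    for path in sorted(Path(resctrl_root).glob("info/*_MON/mon_features")):
        resource = path.parent.name[: -len("_MON")]
        result[resource] = path.read_text().split()
    return result


if __name__ == "__main__":
    for resource, events in expected_events().items():
        print(resource, events)
```

With this approach the "mon_data" text only needs to describe the directory
layout, and the per-event details stay in one place.
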

> +
> +	"core_energy" reports a floating point number for the energy
> +	(in Joules) used by CPUs for each RMID.
> +
> +	"activity" also reports a floating point value (in Farads).
> +	This provides an estimate of work done independent of the
> +	frequency that the CPUs used for execution.
> +
> +	Note that these two counters only measure energy/activity
> +	in the "core" of the CPU (arithmetic units, TLB, L1 and L2
> +	caches, etc.). They do not include L3 cache, memory, I/O
> +	devices etc.
> +
> +	All other events report decimal integer values.
> +
> +	In a MON group these files provide a read out of the current
> +	value of the event for all tasks in the group. In CTRL_MON groups
> +	these files provide the sum for all tasks in the CTRL_MON group
> +	and all tasks in MON groups. Please see example section for more
> +	details on usage.
> +
>  	On systems with Sub-NUMA Cluster (SNC) enabled there are extra
>  	directories for each node (located within the "mon_L3_XX" directory
>  	for the L3 cache they occupy). These are named "mon_sub_L3_YY"
> @@ -1300,6 +1333,38 @@ Example with C::
>      resctrl_release_lock(fd);
>    }
>  
> +Debugfs
> +=======
> +In addition to the use of debugfs for tracing of pseudo-locking
> +performance, architecture code may create debugfs directories
> +associated with monitoring features for a specific resource.
> +
> +The full pathname for these is in the form:
> +
> +    /sys/kernel/debug/resctrl/info/{resource_name}_MON/{arch}/
> +
> +The prescence, names, and format of these files will vary

prescence -> presence 

Just to keep options open, I would rather have this say "may vary".

> +between architectures even if the same resource is present.
> +
> +PERF_PKG_MON/x86_64
> +-------------------
> +Three files are present per telemetry aggregator instance
> +that show when and how often the hardware has failed to
> +collect and accumulate data from the CPUs.
> +
> +agg_data_loss_count:
> +	This counts the number of times that this aggregator
> +	failed to accumulate a counter value supplied by a CPU.
> +
> +agg_data_loss_timestamp:
> +	This is a "timestamp" from a free running 25MHz uncore
> +	timer indicating when the most recent data loss occurred.
> +
> +last_update_timestamp:
> +	Another 25MHz timestamp indicating when the
> +	most recent counter update was successfully applied.

Same comment wrt missing "agg" prefix.
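
For readers of the two timestamp files, a minimal sketch of converting the
raw values to seconds may help. The only fact taken from the text above is
the 25MHz free running timer. The on-disk format of the files (decimal or
hex) is not stated, so the helpers below take already-parsed integers, and
both helper names are hypothetical.

```python
UNCORE_TIMER_HZ = 25_000_000  # 25MHz, per the proposed documentation text


def ticks_to_seconds(ticks):
    """Convert a raw timer value to seconds since the timer started."""
    return ticks / UNCORE_TIMER_HZ


def seconds_since_loss(loss_ticks, update_ticks):
    """Seconds between the last data loss and the last successful update.

    Assumption: both values come from the same timer and it has not
    wrapped between the two readings.
    """
    return ticks_to_seconds(update_ticks - loss_ticks)


if __name__ == "__main__":
    print(ticks_to_seconds(25_000_000))
```
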

> +
> +
>  Examples for RDT Monitoring along with allocation usage
>  =======================================================
>  Reading monitored data

Reinette
