Message-ID: <aMgvDIdZ7dILeaS6@agluck-desk3>
Date: Mon, 15 Sep 2025 08:21:48 -0700
From: "Luck, Tony" <tony.luck@...el.com>
To: Fenghua Yu <fenghuay@...dia.com>, Reinette Chatre
<reinette.chatre@...el.com>, Maciej Wieczor-Retman
<maciej.wieczor-retman@...el.com>, Peter Newman <peternewman@...gle.com>,
James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>, "Drew
Fustini" <dfustini@...libre.com>, Dave Martin <Dave.Martin@....com>, Chen Yu
<yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v10 28/28] x86,fs/resctrl: Update Documentation for
package events
> + "activity", etc. The info/*/mon_features files provide the full
Need to escape that '*' to avoid:
Documentation/filesystems/resctrl.rst:526: WARNING: Inline emphasis start-string without end-string. [docutils]
Note that this had me scratching my head for a bit because the line
number in the warning points to this innocent line
526 Within each directory there is one file per event. For
two lines before the problem.
Updated patch below:
-Tony
>From 13a738760802370fc69414749847e12dced03868 Mon Sep 17 00:00:00 2001
From: Tony Luck <tony.luck@...el.com>
Date: Fri, 12 Sep 2025 13:43:02 -0700
Subject: [PATCH] x86,fs/resctrl: Update Documentation for package events
Update resctrl filesystem documentation with the details about the
resctrl files that support telemetry events.
Signed-off-by: Tony Luck <tony.luck@...el.com>
---
Documentation/filesystems/resctrl.rst | 100 ++++++++++++++++++++++----
1 file changed, 87 insertions(+), 13 deletions(-)
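Not part of the patch: a minimal Python sketch of the group limit rule that the new "num_rmids" text below describes (the smaller of the L3_MON and PERF_PKG_MON values). The mount point /sys/fs/resctrl and the helper name group_limit are assumptions made for illustration. A directory that is absent is skipped.

```python
from pathlib import Path

# Monitoring directories named in the documentation text.
MON_DIRS = ("L3_MON", "PERF_PKG_MON")


def group_limit(info_dir="/sys/fs/resctrl/info"):
    """Return the smallest num_rmids value found, or None if none exist.

    info_dir defaults to the conventional resctrl mount point, which is
    an assumption; pass another path if resctrl is mounted elsewhere.
    """
    values = []
    for name in MON_DIRS:
        path = Path(info_dir) / name / "num_rmids"
        if path.is_file():
            # Each file holds a single decimal integer.
            values.append(int(path.read_text().strip()))
    return min(values) if values else None
```

Calling group_limit() on a system with both directories present would give the upper bound on "CTRL_MON" + "MON" groups as described in the text.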
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 006d23af66e1..cb6da9614f58 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -168,13 +168,12 @@ with respect to allocation:
bandwidth percentages are directly applied to
the threads running on the core
-If RDT monitoring is available there will be an "L3_MON" directory
+If L3 monitoring is available there will be an "L3_MON" directory
with the following files:
"num_rmids":
- The number of RMIDs available. This is the
- upper bound for how many "CTRL_MON" + "MON"
- groups can be created.
+ The number of RMIDs supported by hardware for
+ L3 monitoring events.
"mon_features":
Lists the monitoring events if
@@ -400,6 +399,19 @@ with the following files:
bytes) at which a previously used LLC_occupancy
counter can be considered for re-use.
+If telemetry monitoring is available there will be a "PERF_PKG_MON" directory
+with the following files:
+
+"num_rmids":
+ The number of RMIDs supported by hardware for
+ telemetry monitoring events.
+
+"mon_features":
+ Lists the telemetry monitoring events that are enabled on this system.
+
+The upper bound for how many "CTRL_MON" + "MON" groups can be created
+is the smaller of the L3_MON and PERF_PKG_MON "num_rmids" values.
+
Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
@@ -505,15 +517,40 @@ When control is enabled all CTRL_MON groups will also contain:
When monitoring is enabled all MON groups will also contain:
"mon_data":
- This contains a set of files organized by L3 domain and by
- RDT event. E.g. on a system with two L3 domains there will
- be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
- directories have one file per event (e.g. "llc_occupancy",
- "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
- files provide a read out of the current value of the event for
- all tasks in the group. In CTRL_MON groups these files provide
- the sum for all tasks in the CTRL_MON group and all tasks in
- MON groups. Please see example section for more details on usage.
+ This contains directories for each monitor domain: one set for
+ each instance of an L3 cache and another set for each processor
+ package. The L3 cache directories are named "mon_L3_00",
+ "mon_L3_01" etc. The package directories are named
+ "mon_PERF_PKG_00", "mon_PERF_PKG_01" etc.
+
+ Within each directory there is one file per event. For
+ example the L3 directories may contain "llc_occupancy", "mbm_total_bytes",
+ and "mbm_local_bytes". The PERF_PKG directories may contain "core_energy",
+ "activity", etc. The info/`*`/mon_features files provide the full
+ list of event/file names.
+
+ "core_energy" reports a floating point number for the energy (in Joules)
+ consumed by cores (registers, arithmetic units, TLB and L1/L2 caches)
+ during execution of instructions summed across all logical CPUs on a
+ package for the current RMID.
+
+ "activity" also reports a floating point value (in Farads).
+ This provides an estimate of work done independent of the
+ frequency that the CPUs used for execution.
+
+ Note that these two counters only measure energy/activity
+ in the "core" of the CPU (arithmetic units, TLB, L1 and L2
+ caches, etc.). They do not include L3 cache, memory, I/O
+ devices etc.
+
+ All other events report decimal integer values.
+
+ In a MON group these files provide a read out of the current
+ value of the event for all tasks in the group. In CTRL_MON groups
+ these files provide the sum for all tasks in the CTRL_MON group
+ and all tasks in MON groups. Please see example section for more
+ details on usage.
+
On systems with Sub-NUMA Cluster (SNC) enabled there are extra
directories for each node (located within the "mon_L3_XX" directory
for the L3 cache they occupy). These are named "mon_sub_L3_YY"
@@ -1506,6 +1543,43 @@ Example with C::
resctrl_release_lock(fd);
}
+Debugfs
+=======
+In addition to the use of debugfs for tracing of pseudo-locking
+performance, architecture code may create debugfs directories
+associated with monitoring features for a specific resource.
+
+The full pathname for these is in the form:
+
+ /sys/kernel/debug/resctrl/info/{resource_name}_MON/{arch}/
+
+The presence, names, and format of these files may vary
+between architectures even if the same resource is present.
+
+PERF_PKG_MON/x86_64
+-------------------
+Three files that show status are present per telemetry
+aggregator instance. The prefix of each file name describes
+the type ("energy" or "perf"), which processor package it
+belongs to, and the instance number of the aggregator. For
+example: "energy_pkg1_agg2".
+
+The suffix describes which data is reported in the file and
+is one of:
+
+data_loss_count:
+ This counts the number of times that this aggregator
+ failed to accumulate a counter value supplied by a CPU.
+
+data_loss_timestamp:
+ This is a "timestamp" from a free running 25MHz uncore
+ timer indicating when the most recent data loss occurred.
+
+last_update_timestamp:
+ Another 25MHz timestamp indicating when the
+ most recent counter update was successfully applied.
+
+
Examples for RDT Monitoring along with allocation usage
=======================================================
Reading monitored data
--
2.51.0
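
Not part of the patch: a rough Python sketch of how a tool might decode the PERF_PKG_MON/x86_64 debugfs file names and timestamps described above. It assumes the prefix and suffix are joined with "_", which the documentation text does not state. The helper names parse_name and ticks_to_seconds are hypothetical.

```python
import re

# Tick rate of the free running uncore timer named in the text.
TIMER_HZ = 25_000_000

SUFFIXES = ("data_loss_count", "data_loss_timestamp", "last_update_timestamp")

# Assumption: prefix and suffix are joined with "_",
# e.g. "energy_pkg1_agg2_data_loss_count".
NAME_RE = re.compile(
    r"^(?P<type>energy|perf)_pkg(?P<pkg>\d+)_agg(?P<agg>\d+)_(?P<suffix>"
    + "|".join(SUFFIXES)
    + r")$"
)


def parse_name(name):
    """Split a file name into its fields, or return None if it does not match."""
    m = NAME_RE.match(name)
    if m is None:
        return None
    return {
        "type": m["type"],
        "package": int(m["pkg"]),
        "aggregator": int(m["agg"]),
        "suffix": m["suffix"],
    }


def ticks_to_seconds(ticks):
    """Convert a count of 25MHz timer ticks (or a tick delta) to seconds."""
    return ticks / TIMER_HZ
```

Subtracting data_loss_timestamp from last_update_timestamp and passing the result to ticks_to_seconds() would give the time between the two events in seconds, given that both come from the same timer.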