Message-ID: <f3ba783a-6387-4997-9e8c-897109ee3559@intel.com>
Date: Tue, 8 Jul 2025 13:50:39 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>
CC: Fenghua Yu <fenghuay@...dia.com>, Maciej Wieczor-Retman
	<maciej.wieczor-retman@...el.com>, Peter Newman <peternewman@...gle.com>,
	James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>, "Drew
 Fustini" <dfustini@...libre.com>, Dave Martin <Dave.Martin@....com>, "Anil
 Keshavamurthy" <anil.s.keshavamurthy@...el.com>, Chen Yu
	<yu.c.chen@...el.com>, <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v6 00/30] x86,fs/resctrl telemetry monitoring

Hi Tony,

On 6/30/25 3:46 PM, Luck, Tony wrote:
> On Mon, Jun 30, 2025 at 10:51:50AM -0700, Reinette Chatre wrote:
>>
>> Tony,
>>
>> On 6/26/25 9:49 AM, Tony Luck wrote:
>>> Background
>>> ----------
>>>
>>> Telemetry features are being implemented in conjunction with the
>>> IA32_PQR_ASSOC.RMID value on each logical CPU. This is used to send
>>> counts for various events to a collector in a nearby OOBMSM device to be
>>> accumulated with counts for each <RMID, event> pair received from other
>>> CPUs. Cores send event counts when the RMID value changes, or after each
>>> 2ms elapsed time.
>>
>> To start a review of this jumbo series and find that the *first* [1]
>> (straightforward) request from the previous review has not been addressed is
>> demoralizing. I was hoping that the previous version's discussions would result
>> in review feedback either addressed or discussed (never ignored). I
>> cannot imagine how requesting that OOBMSM be expanded could be invalid, though.
>>
>> Reinette
>>
>> [1] https://lore.kernel.org/lkml/b8ddce03-65c0-4420-b30d-e43c54943667@intel.com/
> 
> My profound apologies for blowing it (again). I went through the comments
> to patches multiple times to try and catch all your comments. But somehow
> skipped the cover letter :-( .
> 
> Here's a re-write to address comments, but also to try to provide
> a better story line starting with how the logical processors capture
> the event data, following on with aggregator processing, etc.
> 
> -Tony
> 
> ---
> 
> On Intel systems that support per-RMID telemetry monitoring each logical
> processor keeps a local count for various events. When the IA32_PQR_ASSOC.RMID
> value for the logical processor changes (or when a two millisecond counter
> expires) these event counts are transmitted to an event aggregator on
> the same package as the processor together with the current RMID value. The
> event counters are reset to zero to begin counting again.
> 
> Each aggregator takes the incoming event counts and adds them to
> cumulative counts for each event for each RMID. Note that there can be
> multiple aggregators on each package with no architectural association
> between logical processors and an aggregator.
> 
> All of these aggregated counters can be read by an operating system from
> the MMIO space of the Out Of Band Management Service Module (OOBMSM)
> device(s) on a system. Any counter can be read from any logical processor.
> 
> Intel publishes details for each processor generation showing which
> events are counted by each logical processor and the offsets for each
> accumulated counter value within the MMIO space in XML files here:
> https://github.com/intel/Intel-PMT.
> 
> For example, there are two energy-related telemetry events for the Clearwater
> Forest family of processors and the MMIO space looks like this:
> 
> Offset	RMID	Event
> ------	----	-----
> 0x0000	0	core_energy
> 0x0008	0	activity
> 0x0010	1	core_energy
> 0x0018	1	activity
> ...
> 0x23F0	575	core_energy
> 0x23F8	575	activity
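
The table above implies the offset of any counter is computable directly from the RMID and event index. A minimal sketch, assuming the dense row-major layout shown (counters grouped per RMID, 8 bytes each; the function name is made up):

```c
#include <assert.h>
#include <stdint.h>

#define NUM_EVENTS 2u	/* core_energy = 0, activity = 1 on this family */

/*
 * Byte offset of a counter within the aggregator MMIO space, assuming
 * the dense layout in the table: all counters for RMID 0, then all
 * counters for RMID 1, and so on, 8 bytes per counter.
 */
static uint64_t counter_offset(uint32_t rmid, uint32_t event)
{
	return ((uint64_t)rmid * NUM_EVENTS + event) * 8;
}
```
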
> 
> In addition the XML file provides the units (Joules for core_energy,
> Farads for activity) and the type of data (fixed-point binary with
> bit 63 used as to indicate the data is valid, and the low 18 bits as a

"bit 63 used as to indicate" -> "bit 63 used to indicate"?

> binary fraction).
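
If I read the format description correctly, decoding a raw counter would look something like this (a sketch under the stated assumptions: bit 63 is a valid flag, the low 18 bits are a binary fraction; the function name is made up):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VALID_BIT (1ULL << 63)
#define FRAC_BITS 18

/*
 * Decode one raw counter value: bit 63 is a valid flag, the remaining
 * bits are fixed-point with an 18-bit binary fraction.  Returns false
 * (leaving *value untouched) if the valid bit is clear.
 */
static bool decode_counter(uint64_t raw, double *value)
{
	if (!(raw & VALID_BIT))
		return false;
	*value = (double)(raw & ~VALID_BIT) / (double)(1ULL << FRAC_BITS);
	return true;
}
```
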
> 
> Finally, each XML file provides a 32-bit unique id (or guid) that is
> used as an index to find the correct XML description file for each
> telemetry implementation.
> 
> The INTEL_PMT_DISCOVERY driver provides intel_pmt_get_regions_by_feature()
> to enumerate the aggregator instances on a platform. It provides:

I think it will be helpful to prime the connection between "aggregator"
and "telemetry region" here. For example,

"to enumerate the aggregator instances on a platform" -> "to enumerate
the aggregator instances (also referred to as "telemetry regions" in this series)
on a platform"
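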

> 1) guid  - so resctrl can determine which events are supported
> 2) mmio base address of counters

mmio -> MMIO

> 3) package id
> 
> Resctrl accumulates counts from all aggregators on a package in order
> to provide a consistent user interface across processor generations.
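
So the per-package value reported to the user is just the sum of the matching counter across every aggregator instance on that package, roughly (an illustrative sketch, not the resctrl code; the struct and function names are made up):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of one aggregator instance on a package.  The
 * counters array stands in for decoded reads of the aggregator's MMIO
 * counter space, indexed per <RMID, event>.
 */
struct aggregator {
	const uint64_t *counters;
	size_t stride;		/* counters per RMID */
};

/* Sum one <RMID, event> counter across all aggregators on a package. */
static uint64_t package_total(const struct aggregator *aggs, size_t n_aggs,
			      uint32_t rmid, uint32_t event)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n_aggs; i++)
		total += aggs[i].counters[rmid * aggs[i].stride + event];
	return total;
}
```
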
> 
> Directory structure for the telemetry events looks like this:
> 
> $ tree /sys/fs/resctrl/mon_data/
> /sys/fs/resctrl/mon_data/
> mon_data
> ├── mon_PERF_PKG_00
> │   ├── activity
> │   └── core_energy
> └── mon_PERF_PKG_01
>     ├── activity
>     └── core_energy
> 
> Reading the "core_energy" file from some resctrl mon_data directory shows
> the cumulative energy (in Joules) used by all tasks that ran with the RMID
> associated with that directory on a given package. Note that "core_energy"
> reports only energy consumed by CPU cores (data processing units,
> L1/L2 caches, etc.). It does not include energy used in the "uncore"
> (L3 cache, on package devices, etc.), or used by memory or I/O devices.

Thank you very much for this rework. I found this much easier to follow.

Reinette

