Message-ID: <1d6606fd-5047-4286-ac69-0dfe4de1b844@intel.com>
Date: Wed, 28 May 2025 15:21:42 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>
CC: Fenghua Yu <fenghuay@...dia.com>, Maciej Wieczor-Retman
<maciej.wieczor-retman@...el.com>, Peter Newman <peternewman@...gle.com>,
James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>, "Drew
Fustini" <dfustini@...libre.com>, Dave Martin <Dave.Martin@....com>, "Anil
Keshavamurthy" <anil.s.keshavamurthy@...el.com>, Chen Yu
<yu.c.chen@...el.com>, <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v5 00/29] x86/resctrl telemetry monitoring
Hi Tony,
On 5/28/25 2:38 PM, Luck, Tony wrote:
> Hi Reinette,
>
> I've begun drafting a new cover letter to explain telemetry.
>
> Here's the introduction. Let me know if it helps cover the
> gaps and ambiguities that you pointed out.
>
> -Tony
>
>
> RMID based telemetry events
> ---------------------------
>
> Each CPU on a system keeps a local count of various events.
>
> Every two milliseconds, or when the value of the RMID field in the
> IA32_PQR_ASSOC MSR is changed, the CPU transmits all the event counts
> together with the value of the RMID to a nearby OOBMSM (Out of band
> management services module) device. The CPU then resets all counters and
> begins counting events for the new RMID or time interval.
>
> The OOBMSM device sums each event count with those received from other
> CPUs, keeping a running total for each event for each RMID.
>
> The operating system can read these counts to gather a picture of
> system-wide activity for each of the logged events per-RMID.
>
> E.g. the operating system may assign RMID 5 to all the tasks performing
> a certain job. When it reads the core energy event counter for
> RMID 5 it will see the total energy consumed by CPU cores for all tasks
> in that job while running on any CPU. This is a much lower overhead
> mechanism to track events per job than the typical "perf" approach
> of reading counters on every context switch.
>
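As I understand it, the flow described above can be sketched as below
(hypothetical names for illustration only, not the kernel or hardware
interface: `OOBMSM`, `CPU`, `flush()`, and the event name "core_energy"
are all made up for this sketch):

```python
# Minimal model of the quoted RMID telemetry flow: each CPU accumulates
# event counts locally, then flushes them, tagged with its current RMID,
# to an aggregator standing in for the OOBMSM device, which keeps a
# running per-RMID total for each event.

from collections import defaultdict

class OOBMSM:
    def __init__(self):
        # totals[rmid][event] -> running system-wide sum
        self.totals = defaultdict(lambda: defaultdict(int))

    def receive(self, rmid, counts):
        for event, value in counts.items():
            self.totals[rmid][event] += value

    def read(self, rmid, event):
        # What the OS reads: the system-wide total for (RMID, event).
        return self.totals[rmid][event]

class CPU:
    def __init__(self, oobmsm):
        self.oobmsm = oobmsm
        self.rmid = 0
        self.counts = defaultdict(int)

    def count(self, event, value):
        self.counts[event] += value

    def flush(self):
        # Happens every 2 ms, or when IA32_PQR_ASSOC.rmid is rewritten.
        self.oobmsm.receive(self.rmid, dict(self.counts))
        self.counts.clear()

    def set_rmid(self, rmid):
        self.flush()        # counts so far belong to the old RMID
        self.rmid = rmid

oobmsm = OOBMSM()
cpu0, cpu1 = CPU(oobmsm), CPU(oobmsm)

# A job with RMID 5 runs on both CPUs.
cpu0.set_rmid(5); cpu0.count("core_energy", 30); cpu0.flush()
cpu1.set_rmid(5); cpu1.count("core_energy", 12); cpu1.flush()

print(oobmsm.read(5, "core_energy"))  # 42: summed across CPUs
```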
Could you please elaborate on the CPU vs core distinction?
If the example above is for a system with below topology (copied from
Documentation/arch/x86/topology.rst):
[package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
-> [thread 1] -> Linux CPU 1
-> [core 1] -> [thread 0] -> Linux CPU 2
-> [thread 1] -> Linux CPU 3
In the example, RMID 5 is assigned to tasks running "a certain job", for
convenience I will name it "jobA". Consider if the example is extended
with RMID 6 assigned to tasks running another job, "jobB".
If a jobA task is scheduled on CPU 0 and a jobB task is scheduled on CPU 1
then it may look like:
[package 0] -> [core 0] -> [thread 0] -> Linux CPU 0 #RMID 5
-> [thread 1] -> Linux CPU 1 #RMID 6
-> [core 1] -> [thread 0] -> Linux CPU 2
-> [thread 1] -> Linux CPU 3
The example above states:
When it reads the core energy event counter for RMID 5 it will
see the total energy consumed by CPU cores for all tasks in that
job while running on any CPU.
With RMID 5 and RMID 6 both running on core 0, and the statement that
"RMID 5 will see the total energy consumed by CPU cores", does this mean
that reading the RMID 5 counter will return the energy consumed by
core 0 while RMID 5 is assigned to CPU 0? Since core 0 contains both
CPU 0 and CPU 1, would reading RMID 5 thus return data for both RMID 5
and RMID 6 (jobA and jobB)?
And vice versa: would reading RMID 6 also include energy consumed by
tasks running with RMID 5?
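To make the question concrete, here is a toy illustration (hypothetical
attribution models, not anything from the patch series): core 0 consumes
100 units of energy while its two SMT threads run under different RMIDs.

```python
# Core 0 consumed this much energy while CPU 0 ran with RMID 5 and
# CPU 1 ran with RMID 6. A core-scoped counter cannot tell the two
# threads' contributions apart.
core_energy = 100
resident_rmids = {5, 6}

# Model A: charge the whole core energy to every resident RMID.
# Reading RMID 5 then includes energy driven by RMID 6's work, and the
# per-RMID totals sum to more than the core actually consumed.
model_a = {rmid: core_energy for rmid in resident_rmids}
print(sum(model_a.values()))   # 200 > 100: double counting

# Model B: split the core energy evenly between resident RMIDs.
# Totals now add up, but each RMID's share is an approximation, not the
# energy its own tasks actually caused.
model_b = {rmid: core_energy / len(resident_rmids) for rmid in resident_rmids}
print(sum(model_b.values()))   # 100.0
```

Which of these (or something else) does the hardware do?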
Reinette