Message-ID: <97d03a02-d110-48e9-9619-27a7596aa16a@intel.com>
Date: Thu, 14 Aug 2025 14:50:35 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
 Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
	<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
	<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
	<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 22/32] x86/resctrl: Read telemetry events

Hi Tony,

How about this for the subject: "x86/resctrl: Enable and read telemetry events"?

On 8/11/25 11:16 AM, Tony Luck wrote:
> Telemetry events are enumerated by the INTEL_PMT_TELEMETRY subsystem.

The above gives context but does not actually describe what this patch builds on.
Here is something to start working from:

	The active event groups are known after matching the known event groups
	with the system's telemetry events enumerated by the INTEL_PMT_TELEMETRY
	subsystem.

	Enable the active events in the resctrl filesystem to make them available
	to user space. When enabling an event, pass a pointer to its pmt_event
	structure within the struct event_group. resctrl stores this pointer in
	mon_evt::arch_priv and passes it back when asking to read the event data,
	which enables the data to be found in MMIO.

	...
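
The round trip described in the changelog text above (pointer handed over at
enable time, handed back at read time) can be sketched in user space as
follows. This is only an illustration under assumptions: the struct layouts
are cut down to the fields needed, evts is given a fixed size of 4, and
group_of() is a hypothetical helper name. container_of() is defined locally
with offsetof() in place of the kernel macro.

```c
#include <stddef.h>

/* Local stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down layouts: only the fields this sketch needs (assumption). */
struct pmt_event {
	int id;
	int idx;	/* position of this event within event_group::evts */
};

struct event_group {
	int num_events;
	struct pmt_event evts[4];	/* fixed size for illustration only */
};

/*
 * Recover the enclosing event group from the private pointer that was
 * supplied when the event was enabled. Stepping back by pevt->idx
 * elements gives the address of evts[0], from which container_of()
 * finds the start of the group.
 */
static struct event_group *group_of(struct pmt_event *pevt)
{
	struct pmt_event *pevt0 = pevt - pevt->idx;

	return container_of(pevt0, struct event_group, evts);
}
```

The sketch only works if every event's idx matches its slot in evts, which
is the same assumption the read path in this patch relies on.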

	
> resctrl enables events with resctrl_enable_mon_event() passing a pointer
> to the pmt_event structure for the event within the struct event_group.
> The file system stores it in mon_evt::arch_priv.
> 
> Add a check to resctrl_arch_rmid_read() for resource id
> RDT_RESOURCE_PERF_PKG and directly call intel_aet_read_event()
> passing the enum resctrl_event_id for the event and the arch_priv
> pointer that was supplied when the event was enabled.
> 
> There may be multiple aggregators tracking each package, so scan all of
> them and add up all counters.

As mentioned below, it is possible for some aggregators to not return valid data,
and this is treated as a success. The user will not be aware when this happens.
What is the likelihood of this happening? Should the user be made aware when it
does?

> 
> Resctrl now uses readq() so depends on X86_64. Update Kconfig.
> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---

...

>  
> @@ -211,6 +212,9 @@ static int discover_events(struct event_group *e, struct pmt_feature_group *p)

Recurring feedback on this series is that early in the series discover_events()
gets a function comment that describes the "steps" of discovery ... but later in
the series (as in this patch), when discover_events() is updated, these
comments/steps are not updated.
In the final result discover_events() thus has a function comment with only two
steps that document just part of what it does.

I actually find comments within the function that describe what an associated
snippet does more helpful than lumping everything into a list at the top of the
function. For example, the change below can get a comment like:
	/*
	 * Enable all events of the active event group. Pass a pointer to the
	 * event's struct pmt_event as private data that resctrl fs includes
	 * when it requests to read the counter.
	 */

>  
>  	list_add(&e->list, &active_event_groups);
>  
> +	for (int i = 0; i < e->num_events; i++)
> +		resctrl_enable_mon_event(e->evts[i].id, true, e->evts[i].bin_bits, &e->evts[i]);
> +
>  	return 0;
>  }
>  
> @@ -278,3 +282,43 @@ void __exit intel_aet_exit(void)
>  		list_del(&evg->list);
>  	}
>  }
> +
> +#define DATA_VALID	BIT_ULL(63)
> +#define DATA_BITS	GENMASK_ULL(62, 0)
> +
> +/*
> + * Read counter for an event on a domain (summing all aggregators
> + * on the domain).

The function comment can highlight that this is intentional: as long as at
least one aggregator returns valid data the read is considered a success, with
the possibility that partial data may be returned to user space without the
user being aware.

> + */
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id eventid,
> +			 void *arch_priv, u64 *val)
> +{
> +	struct pmt_event *pevt = arch_priv;
> +	struct pkg_mmio_info *mmi;
> +	struct event_group *e;
> +	bool valid = false;
> +	u64 evtcount;
> +	void *pevt0;
> +	int idx;
> +
> +	pevt0 = pevt - pevt->idx;
> +	e = container_of(pevt0, struct event_group, evts);
> +	idx = rmid * e->num_events;
> +	idx += pevt->idx;
> +	mmi = e->pkginfo[domid];
> +
> +	if (idx * sizeof(u64) + sizeof(u64) > e->mmio_size) {
> +		pr_warn_once("MMIO index %d out of range\n", idx);
> +		return -EIO;
> +	}
> +
> +	for (int i = 0; i < mmi->num_regions; i++) {
> +		evtcount = readq(mmi->addrs[i] + idx * sizeof(u64));
> +		if (!(evtcount & DATA_VALID))
> +			continue;
> +		*val += evtcount & DATA_BITS;
> +		valid = true;
> +	}
> +
> +	return valid ? 0 : -EINVAL;
> +}
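
To make the partial-data concern concrete, here is a minimal user-space sketch
of the summing loop. It is not the patch's code. The regs array stands in for
the values readq() would return from each aggregator, and sum_aggregators()
with its nvalid out-parameter are hypothetical names used only for
illustration. Unlike the posted function, the sketch stores the sum instead of
accumulating into *val, and it counts how many aggregators contributed so that
a caller could tell a complete read from a partial one.

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_VALID	(1ULL << 63)
#define DATA_BITS	(~DATA_VALID)

/*
 * User-space model of the per-domain summing loop (hypothetical helper).
 *
 * regs[i] models the raw 64-bit value read from aggregator i: bit 63 is
 * the valid flag, bits 62:0 are the counter.
 *
 * Returns 0 if at least one aggregator supplied valid data, -1 otherwise.
 * *nvalid reports how many aggregators contributed to the sum.
 */
static int sum_aggregators(const uint64_t *regs, size_t num_regions,
			   uint64_t *val, size_t *nvalid)
{
	uint64_t sum = 0;
	size_t n = 0;

	for (size_t i = 0; i < num_regions; i++) {
		/* Skip aggregators that did not return valid data. */
		if (!(regs[i] & DATA_VALID))
			continue;
		sum += regs[i] & DATA_BITS;
		n++;
	}

	*nvalid = n;
	if (!n)
		return -1;

	*val = sum;
	return 0;
}
```

With regs = { DATA_VALID | 5, 7, DATA_VALID | 10 } the sketch returns 0 with a
sum of 15 and nvalid of 2, so one of three aggregators was skipped while the
read still counts as a success. Whether such a count should reach user space
is the open question raised earlier in this reply.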

Reinette
