Message-ID: <97d03a02-d110-48e9-9619-27a7596aa16a@intel.com>
Date: Thu, 14 Aug 2025 14:50:35 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 22/32] x86/resctrl: Read telemetry events
Hi Tony,
How about this for the subject: "x86/resctrl: Enable and read telemetry events"?
On 8/11/25 11:16 AM, Tony Luck wrote:
> Telemetry events are enumerated by the INTEL_PMT_TELEMETRY subsystem.
The above gives context but does not describe what this patch builds on.
Below is something to start working from:
The active event groups are known after matching the known event groups
with the system's telemetry events enumerated by the INTEL_PMT_TELEMETRY
subsystem.
Enable the active events in the resctrl filesystem to make them available to
user space. Pass a pointer to the pmt_event structure of the event within
the struct event_group that resctrl stores in mon_evt::arch_priv. resctrl
passes this pointer back when asking to read the event data, which enables
the data to be found in MMIO.
...
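To illustrate the round trip described above, here is a minimal userspace
sketch, not kernel code. The structures are trimmed to the fields that matter,
and enable_event(), enable_group(), stored_priv[] and group_of() are
hypothetical stand-ins for resctrl_enable_mon_event(), the loop in
discover_events(), mon_evt::arch_priv and the lookup done in the read path:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() (no type checking). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, hypothetical versions of the structures named in the patch. */
struct pmt_event {
	int id;		/* stand-in for enum resctrl_event_id */
	int idx;	/* position of this event within its group */
};

struct event_group {
	int num_events;
	struct pmt_event evts[4];
};

/* Stand-in for mon_evt::arch_priv: the fs side only stores the pointer. */
static void *stored_priv[8];

/* Hypothetical model of resctrl_enable_mon_event(): remember arch_priv. */
static void enable_event(int id, void *arch_priv)
{
	assert(id >= 0 && id < 8);
	stored_priv[id] = arch_priv;
}

/* Enable every event of a group, passing its pmt_event as private data. */
static void enable_group(struct event_group *e)
{
	for (int i = 0; i < e->num_events; i++)
		enable_event(e->evts[i].id, &e->evts[i]);
}

/* On read: step back to evts[0], then recover the enclosing group. */
static struct event_group *group_of(void *arch_priv)
{
	struct pmt_event *pevt = arch_priv;
	struct pmt_event *pevt0 = pevt - pevt->idx;

	return container_of(pevt0, struct event_group, evts);
}
```

The point of the sketch is that the fs side never interprets the pointer. It
only hands it back, and the architecture code can get from any event to its
group because each event records its own index.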
> resctrl enables events with resctrl_enable_mon_event() passing a pointer
> to the pmt_event structure for the event within the struct event_group.
> The file system stores it in mon_evt::arch_priv.
>
> Add a check to resctrl_arch_rmid_read() for resource id
> RDT_RESOURCE_PERF_PKG and directly call intel_aet_read_event()
> passing the enum resctrl_event_id for the event and the arch_priv
> pointer that was supplied when the event was enabled.
>
> There may be multiple aggregators tracking each package, so scan all of
> them and add up all counters.
As mentioned below, it is possible for some aggregators to not return valid data,
and this is treated as a success. The user will not be aware when this happens.
What is the likelihood of this happening? Should the user be made aware when it
does?
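To make the question concrete, below is a rough userspace sketch of one way the
summation could keep track of how many aggregators contributed. This is not
what the patch does. struct sum_result, sum_aggregators() and sum_is_partial()
are hypothetical names, and the raw values stand in for what readq() would
return:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DATA_VALID	(1ULL << 63)		/* bit 63: counter is valid */
#define DATA_BITS	(DATA_VALID - 1)	/* bits 62:0: counter value */

/* Hypothetical result type that records how many aggregators were valid. */
struct sum_result {
	uint64_t sum;		/* total of all valid counters */
	int num_valid;		/* aggregators that reported valid data */
	int num_total;		/* aggregators that were scanned */
};

/* Sum the valid counters, skipping (but counting) the invalid ones. */
static struct sum_result sum_aggregators(const uint64_t *raw, int n)
{
	struct sum_result r = { .sum = 0, .num_valid = 0, .num_total = n };

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		if (!(raw[i] & DATA_VALID))
			continue;
		r.sum += raw[i] & DATA_BITS;
		r.num_valid++;
	}
	return r;
}

/* Partial: some, but not all, aggregators returned valid data. */
static bool sum_is_partial(const struct sum_result *r)
{
	return r->num_valid > 0 && r->num_valid < r->num_total;
}
```

Whether such information should reach user space, and how, depends on the
answer to the likelihood question above.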
>
> Resctrl now uses readq() so depends on X86_64. Update Kconfig.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
...
>
> @@ -211,6 +212,9 @@ static int discover_events(struct event_group *e, struct pmt_feature_group *p)
Recurring feedback on this series is that, at the beginning of the series,
discover_events() gets function comments that describe the "steps" of discovery, but
later in the series (like in this patch), when discover_events() is updated, these
comments/steps are not updated to match.
In the final result, discover_events() thus has function comments with only two steps
that document just part of what it does.
I actually find comments within the function that describe what the associated snippet does
more helpful than lumping everything into a list at the top of the function. For example, the
change below can get a comment like:
/*
 * Enable all events of active event group. Pass pointer to event's struct pmt_event
 * as private data that resctrl fs includes when it requests to read the counter.
 */
>
> list_add(&e->list, &active_event_groups);
>
> +	for (int i = 0; i < e->num_events; i++)
> +		resctrl_enable_mon_event(e->evts[i].id, true, e->evts[i].bin_bits, &e->evts[i]);
> +
> return 0;
> }
>
> @@ -278,3 +282,43 @@ void __exit intel_aet_exit(void)
> list_del(&evg->list);
> }
> }
> +
> +#define DATA_VALID BIT_ULL(63)
> +#define DATA_BITS GENMASK_ULL(62, 0)
> +
> +/*
> + * Read counter for an event on a domain (summing all aggregators
> + * on the domain).
The function comment can highlight that it is intentional that, as long as
at least one aggregator returns valid data, the read is considered a success,
with the possibility that partial data may be returned to user space without
the user being aware.
> + */
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id eventid,
> +			 void *arch_priv, u64 *val)
> +{
> +	struct pmt_event *pevt = arch_priv;
> +	struct pkg_mmio_info *mmi;
> +	struct event_group *e;
> +	bool valid = false;
> +	u64 evtcount;
> +	void *pevt0;
> +	int idx;
> +
> +	pevt0 = pevt - pevt->idx;
> +	e = container_of(pevt0, struct event_group, evts);
> +	idx = rmid * e->num_events;
> +	idx += pevt->idx;
> +	mmi = e->pkginfo[domid];
> +
> +	if (idx * sizeof(u64) + sizeof(u64) > e->mmio_size) {
> +		pr_warn_once("MMIO index %d out of range\n", idx);
> +		return -EIO;
> +	}
> +
> +	for (int i = 0; i < mmi->num_regions; i++) {
> +		evtcount = readq(mmi->addrs[i] + idx * sizeof(u64));
> +		if (!(evtcount & DATA_VALID))
> +			continue;
> +		*val += evtcount & DATA_BITS;
> +		valid = true;
> +	}
> +
> +	return valid ? 0 : -EINVAL;
> +}
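As a side note on the index arithmetic above, here is a small userspace sketch
of the offset computation and range check. It assumes, as the code implies,
that each region holds one 64-bit counter per (RMID, event) pair with all
events of an RMID stored contiguously. counter_offset() and counter_in_range()
are hypothetical helper names, not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the counter for (rmid, evt_idx) within one MMIO region. */
static size_t counter_offset(unsigned int rmid, unsigned int num_events,
			     unsigned int evt_idx)
{
	size_t idx;

	assert(evt_idx < num_events);
	idx = (size_t)rmid * num_events + evt_idx;

	return idx * sizeof(uint64_t);
}

/* True if a full 64-bit counter at @offset fits within @mmio_size bytes. */
static bool counter_in_range(size_t offset, size_t mmio_size)
{
	return offset + sizeof(uint64_t) <= mmio_size;
}
```

With three events per RMID, event index 1 of RMID 2 is counter 7, so it sits at
byte offset 56 and needs a region of at least 64 bytes.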
Reinette