[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a8ee6c4e-7390-4313-9b34-6133186d2f7a@intel.com>
Date: Wed, 22 Oct 2025 21:47:51 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v12 19/31] x86/resctrl: Read telemetry events
Hi Tony,
On 10/13/25 3:33 PM, Tony Luck wrote:
> Telemetry events are enabled during the first mount of the resctrl
> file system.
Not clear how above is relevant to this patch?
>
> Mark telemetry regions that did not pass the sanity checks by
> clearing their MMIO address fields so that they will not be
> used when reading events.
Above is definitely not relevant.
>
> Introduce intel_aet_read_event() to read telemetry events for resource
> RDT_RESOURCE_PERF_PKG. There may be multiple aggregators tracking each
> package, so scan all of them and add up all counters. Aggregators may
> return an invalid data indication if they have received no records for
> a given RMID. Return success to the user if one or more aggregators
> provide valid data.
"success" (via return code of 0) is always returned to the user. The
difference is whether user will see event data, "Unavailable", or "Error".
>
> Resctrl now uses readq() so depends on X86_64. Update Kconfig.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
...
> @@ -230,3 +234,47 @@ void __exit intel_aet_exit(void)
> (*peg)->pfg = NULL;
> }
> }
> +
> +#define DATA_VALID BIT_ULL(63)
> +#define DATA_BITS GENMASK_ULL(62, 0)
> +
> +/*
> + * Read counter for an event on a domain (summing all aggregators
> + * on the domain). If an aggregator hasn't received any data for a
> + * specific RMID, the MMIO read indicates that data is not valid.
> + * Return success if at least one aggregator has valid data.
> + */
> +int intel_aet_read_event(int domid, u32 rmid, enum resctrl_event_id eventid,
> + void *arch_priv, u64 *val)
> +{
> + struct pmt_event *pevt = arch_priv;
> + struct event_group *e;
> + bool valid = false;
> + u64 evtcount;
> + void *pevt0;
> + u32 idx;
> +
> + pevt0 = pevt - pevt->idx;
> + e = container_of(pevt0, struct event_group, evts);
> + idx = rmid * e->num_events;
> + idx += pevt->idx;
> +
> + if (idx * sizeof(u64) + sizeof(u64) > e->mmio_size) {
> + pr_warn_once("MMIO index %u out of range\n", idx);
> + return -EIO;
> + }
> +
> + for (int i = 0; i < e->pfg->count; i++) {
> + if (!e->pfg->regions[i].addr)
> + continue;
> + if (e->pfg->regions[i].plat_info.package_id != domid)
> + continue;
> + evtcount = readq(e->pfg->regions[i].addr + idx * sizeof(u64));
> + if (!(evtcount & DATA_VALID))
> + continue;
> + *val += evtcount & DATA_BITS;
I missed this before. This just blindly adds the event data to whichever variable the
caller provides. Could you please instead use a local variable to do this addition and
on success assign it to the provided val parameter?
> + valid = true;
> + }
> +
> + return valid ? 0 : -EINVAL;
> +}
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 5565a8752266..6804b83934e8 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -252,6 +252,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
>
> resctrl_arch_rmid_read_context_check();
>
> + if (r->rid == RDT_RESOURCE_PERF_PKG)
> + return intel_aet_read_event(hdr->id, rmid, eventid, arch_priv, val);
> +
> if (!domain_header_is_valid(hdr, RESCTRL_MON_DOMAIN, RDT_RESOURCE_L3))
> return -EINVAL;
>
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index faaa2851c5c2..84a0cbe90748 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -526,12 +526,26 @@ static bool cpu_on_correct_domain(struct rmid_read *rr)
>
> static int __mon_event_count(struct rdtgroup *rdtgrp, struct rmid_read *rr)
> {
> + u64 tval = 0;
please declare this variable local to the PERF_PKG block to help keep the
different resource code separate.
> +
> if (!cpu_on_correct_domain(rr))
> return -EINVAL;
>
> switch (rr->r->rid) {
> case RDT_RESOURCE_L3:
> return __l3_mon_event_count(rdtgrp, rr);
> +
> + case RDT_RESOURCE_PERF_PKG:
> + rr->err = resctrl_arch_rmid_read(rr->r, rr->hdr, rdtgrp->closid,
> + rdtgrp->mon.rmid, rr->evt->evtid,
> + rr->evt->arch_priv,
> + &tval, rr->arch_mon_ctx);
> + if (rr->err)
> + return rr->err;
> +
> + rr->val += tval;
> +
> + return 0;
> default:
> return -EINVAL;
> }
Reinette
Powered by blists - more mailing lists