[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bdeab410-df46-42a6-8dff-89e082178245@intel.com>
Date: Tue, 3 Jun 2025 21:02:29 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>, Anil Keshavamurthy <anil.s.keshavamurthy@...el.com>,
Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v5 21/29] x86/resctrl: x86/resctrl: Read core telemetry
events
Hi Tony,
shortlog has a duplicate "x86/resctrl"
On 5/21/25 3:50 PM, Tony Luck wrote:
> The resctrl file system passes requests to read event monitor files to
> the architecture resctrl_arch_rmid_read() to collect values
> from hardware counters.
>
> Use the resctrl resource to differentiate between calls to read legacy
> L3 events from the new telemetry events (which are attached to
> RDT_RESOURCE_PERF_PKG).
>
> There may be multiple aggregators tracking each package, so scan all of
> them and add up all counters.
>
> At run time when a user reads an event file the file system code
> provides the enum resctrl_event_id for the event.
>
> Create a lookup table indexed by event id to provide the telem_entry
> structure and the event index into MMIO space.
First time asking whether the lookup table is needed:
V3: https://lore.kernel.org/lkml/7bb97892-16fd-49c5-90f0-223526ebdf4c@intel.com/
Reminder in V4 that question about need for lookup table is still unanswered:
https://lore.kernel.org/lkml/54291845-a964-4d6a-948c-6d6b14a705dd@intel.com/
Here goes my third attempt:
I still feel that a new lookup table is unnecessary. Looking at the new
structure introduced it unnecessarily duplicates the idx value from
struct pmt_event. As I proposed before, what if struct mon_evt gets
a void *priv that the architecture can set during event enable? resctrl
fs can then provide this pointer back to arch code when user attempts
to read the event.
In this implementation it looks like this void *priv of an event could
point to the event's struct pmt_event entry. The only thing that is missing
is the struct event_group pointer. Looking at previous patch every struct
pmt_event has a sequential index that makes it possible to determine &evts[0]
from any of the struct pmt_event pointers, enabling the use of
container_of() to determine the struct event_group pointer. What do you think?
I surely may be missing something, when I do, please use it as a teaching moment
instead of ignoring me. I spend a lot of time studying your work with the
goal to provide useful feedback. For this feedback to just be ignored makes me
feel like I am wasting my time.
> Enable the events marked as readable from any CPU.
>
> Resctrl now uses readq() so depends on X86_64. Update Kconfig.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/intel_aet.c | 53 +++++++++++++++++++++++++
> arch/x86/kernel/cpu/resctrl/monitor.c | 6 +++
> arch/x86/Kconfig | 2 +-
> 4 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 2b2d4b5a4643..42da0a222c7c 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -169,5 +169,6 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>
> bool intel_aet_get_events(void);
> void __exit intel_aet_exit(void);
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id evtid, u64 *val);
>
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> index bf8e2a6126d2..be52c9302a80 100644
> --- a/arch/x86/kernel/cpu/resctrl/intel_aet.c
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -13,6 +13,7 @@
>
> #include <linux/cleanup.h>
> #include <linux/cpu.h>
> +#include <linux/io.h>
> #include <linux/resctrl.h>
> #include <linux/slab.h>
>
> @@ -128,6 +129,16 @@ static bool skip_this_region(struct telemetry_region *tr, struct event_group *e)
> return false;
> }
>
> +/**
> + * struct evtinfo - lookup table from resctrl_event_id to useful information
> + * @event_group: Pointer to the telem_entry structure for this event
My V4 question about "telem_entry structure" is still unanswered (note that changelog
also refers to "telem_entry structure):
https://lore.kernel.org/lkml/54291845-a964-4d6a-948c-6d6b14a705dd@intel.com/
After I apply this series I see:
$ git grep telem_entry
arch/x86/kernel/cpu/resctrl/intel_aet.c: * @event_group: Pointer to the telem_entry structure for this event
This is thus the only reference to "telem_entry" and I thus still
have the same question.
> + * @idx: Counter index within each per-RMID block of counters
> + */
> +static struct evtinfo {
> + struct event_group *event_group;
> + int idx;
> +} evtinfo[QOS_NUM_EVENTS];
> +
> static void free_mmio_info(struct mmio_info **mmi)
> {
> int num_pkgs = topology_max_packages();
> @@ -199,6 +210,15 @@ static int configure_events(struct event_group *e, struct pmt_feature_group *p)
> }
> e->pkginfo = no_free_ptr(pkginfo);
>
> + for (int i = 0; i < e->num_events; i++) {
> + enum resctrl_event_id evt;
> +
> + evt = e->evts[i].evtid;
> + evtinfo[evt].event_group = e;
> + evtinfo[evt].idx = e->evts[i].evt_idx;
> + resctrl_enable_mon_event(evt, true, e->evts[i].bin_bits);
> + }
> +
> return 0;
> }
>
> @@ -266,3 +286,36 @@ void __exit intel_aet_exit(void)
> free_mmio_info((*peg)->pkginfo);
> }
> }
> +
> +#define DATA_VALID BIT_ULL(63)
> +#define DATA_BITS GENMASK_ULL(62, 0)
> +
> +/*
> + * Read counter for an event on a domain (summing all aggregators
> + * on the domain).
> + */
> +int intel_aet_read_event(int domid, int rmid, enum resctrl_event_id evtid, u64 *val)
> +{
> + struct evtinfo *info = &evtinfo[evtid];
> + struct mmio_info *mmi;
> + u64 evtcount;
> + int idx;
> +
> + idx = rmid * info->event_group->num_events;
> + idx += info->idx;
> + mmi = info->event_group->pkginfo[domid];
> +
> + if (idx * sizeof(u64) + sizeof(u64) > info->event_group->mmio_size) {
> + pr_warn_once("MMIO index %d out of range\n", idx);
> + return -EIO;
> + }
> +
> + for (int i = 0; i < mmi->count; i++) {
> + evtcount = readq(mmi->addrs[i] + idx * sizeof(u64));
> + if (!(evtcount & DATA_VALID))
> + return -EINVAL;
> + *val += evtcount & DATA_BITS;
> + }
> +
> + return 0;
> +}
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 1f6dc253112f..c99aa9dacfd8 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -230,6 +230,12 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
>
> resctrl_arch_rmid_read_context_check();
>
> + if (r->rid == RDT_RESOURCE_PERF_PKG)
> + return intel_aet_read_event(d->hdr.id, rmid, eventid, val);
> +
This does not look right. As per the heading the function changed has the following signature:
int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
u32 unused, u32 rmid, enum resctrl_event_id eventid,
u64 *val, void *ignored)
The domain provided to the function is a pointer to a struct rdt_l3_mon_domain
so seeing this "r->rid == RDT_RESOURCE_PERF_PKG" test is unexpected because a
domain with type struct rdt_l3_mon_domain should not belong to a PERF_PKG resource.
Looks like that whole stack starting from rdtgroup_mondata_show() needs a second
look. Review of this work has not been going well and the skeptic in me is now
starting to think that the answer to my earlier question about why only a
subset of L3 resource specific functions are renamed is: "because if all L3
specific functions are renamed then it will be easier for reviewer to notice
when L3 specific functions are (ab)used for the PERF_PKG resource."
This extends to the data structures, the new events rely
on rdtgroup_mondata_show() to query the data and in turn rdtgroup_mondata_show()
relies on struct rmid_read that only has one domain pointer and it is to
struct rdt_l3_mon_domain.
> + if (r->rid != RDT_RESOURCE_L3)
> + return -EIO;
> +
> prmid = logical_rmid_to_physical_rmid(cpu, rmid);
> ret = __rmid_read_phys(prmid, eventid, &msr_val);
> if (ret)
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 52cfb69c343f..24df3f04a115 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -506,7 +506,7 @@ config X86_MPPARSE
>
> config X86_CPU_RESCTRL
> bool "x86 CPU resource control support"
> - depends on X86 && (CPU_SUP_INTEL || CPU_SUP_AMD)
> + depends on X86_64 && (CPU_SUP_INTEL || CPU_SUP_AMD)
> depends on MISC_FILESYSTEMS
> select ARCH_HAS_CPU_RESCTRL
> select RESCTRL_FS
Reinette
Powered by blists - more mailing lists