[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d9d355fc-cba9-4d6f-9b98-1cbdfdeb58f7@intel.com>
Date: Fri, 25 Jul 2025 16:39:19 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>, Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v7 17/31] x86/resctrl: Discover hardware telemetry events
Hi Tony,
On 7/11/25 4:53 PM, Tony Luck wrote:
> Hardware has one or more telemetry event aggregators per package
> for each group of telemetry events. Each aggregator provides access
>From what I can tell this is the first mention of "group of telemetry
events" yet reader is assumed to know what it represents.
Where is this concept introduced?
> to event counts in an array of 64-bit values in MMIO space. There
> is a "guid" (in this case a unique 32-bit integer) which refers to
> an XML file published in https://github.com/intel/Intel-PMT
> that provides all the details about each aggregator.
>
> The XML file provides the following information:
> 1) Which telemetry events are included in the group for this aggregator.
Regarding "in the group for this aggregator" ... does this mean that
each aggregator can have different events from the "group"?
> 2) The order in which the event counters appear for each RMID.
> 3) The value type of each event counter (integer or fixed-point).
> 4) The number of RMIDs supported.
> 5) Which additional aggregator status registers are included.
> 6) The total size of the MMIO region for this aggregator.
Does this mean that aggregators belonging to an event group
can have different MMIO sizes? (I expect this ties in with (1))
So, specifically, the "for this aggregator" implies that these properties
can differ between aggregators belonging to the same "event group" so reader
will be looking out for how the code handles this ... but it doesn't (yet
can only be seen in later patches).
> Each aggregator makes event counters available to Linux in
> a region of MMIO memory. Enumeration of these regions is
> done by the INTEL_PMT_DISCOVERY discovery driver.
>
> Add a new Kconfig option CONFIG_X86_RESCTRL_CPU_INTEL_AET for the
I proposed same namespace prefix before, yet this keeps being different.
Why not just be consistent with exiting CONFIG_X86_CPU_RESCTRL?
> Intel specific parts of telemetry code. This depends on the
> INTEL_PMT_DISCOVERY driver being built-in to the kernel for
> enumeration of telemetry features.
hmmm ... attempting to build with these dependencies met result in:
ld: vmlinux.o: in function `get_pmt_feature':
SNIP/linux/arch/x86/kernel/cpu/resctrl/intel_aet.c:292:(.text+0x6f4e9): undefined reference to `intel_pmt_get_regions_by_feature'
ld: vmlinux.o: in function `__free_intel_pmt_put_feature_group':
SNIP/linux/arch/x86/kernel/cpu/resctrl/intel_aet.c:274:(.text+0x6f708): undefined reference to `intel_pmt_put_feature_group'
ld: vmlinux.o: in function `intel_aet_exit':
SNIP/linux/arch/x86/kernel/cpu/resctrl/intel_aet.c:382:(.exit.text+0x286): undefined reference to `intel_pmt_put_feature_group'
Looks like the dependency should be INTEL_PMT_TELEMETRY instead?
>
> Call intel_pmt_get_regions_by_feature() for each pmt_feature_id
> that indicates per-RMID telemetry.
>
> Save the returned pmt_feature_group pointers with guids that are known
> to resctrl for use at run time.
>
> Those pointers are returned to the INTEL_PMT_DISCOVERY driver at
INTEL_PMT_TELEMETRY?
> resctrl_arch_exit() time.
>
What follows is not appropriate for the merged changelog and is instead material for
the "maintainer notes" that is below the "---"
> Note that checkpatch complains about the alignment of additional
> lines in the definition of the intel_pmt_put_feature_group
> cleanup helper. I didn't find a way to appease conflicting
> requirements from checkpatch.
This just mentions one checkpatch complaint. Could you please expand this to
mention what the conflicting checkpatch.pl requirements are?
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 8 ++
> arch/x86/kernel/cpu/resctrl/core.c | 5 +
> arch/x86/kernel/cpu/resctrl/intel_aet.c | 133 ++++++++++++++++++++++++
> arch/x86/Kconfig | 13 +++
> arch/x86/kernel/cpu/resctrl/Makefile | 1 +
> 5 files changed, 160 insertions(+)
> create mode 100644 arch/x86/kernel/cpu/resctrl/intel_aet.c
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 684a1b830ced..36a2072c19c7 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -169,4 +169,12 @@ void __init intel_rdt_mbm_apply_quirk(void);
>
> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>
> +#ifdef CONFIG_X86_RESCTRL_CPU_INTEL_AET
> +bool intel_aet_get_events(void);
> +void __exit intel_aet_exit(void);
> +#else
> +static inline bool intel_aet_get_events(void) { return false; }
> +static inline void __exit intel_aet_exit(void) { }
> +#endif
> +
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 2b2f76c76d73..b8288f5d4aff 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -734,6 +734,9 @@ void resctrl_arch_pre_mount(void)
>
> if (!atomic_try_cmpxchg(&only_once, &old, 1))
> return;
> +
> + if (!intel_aet_get_events())
> + return;
> }
>
> enum {
> @@ -1086,6 +1089,8 @@ late_initcall(resctrl_arch_late_init);
>
> static void __exit resctrl_arch_exit(void)
> {
> + intel_aet_exit();
> +
> cpuhp_remove_state(rdt_online);
>
> resctrl_exit();
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> new file mode 100644
> index 000000000000..d177e5aa1f6a
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -0,0 +1,133 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Resource Director Technology(RDT)
> + * - Intel Application Energy Telemetry
> + *
> + * Copyright (C) 2025 Intel Corporation
> + *
> + * Author:
> + * Tony Luck <tony.luck@...el.com>
> + */
> +
> +#define pr_fmt(fmt) "resctrl: " fmt
> +
> +#include <linux/cleanup.h>
> +#include <linux/cpu.h>
> +#include <linux/intel_vsec.h>
> +#include <linux/resctrl.h>
> +
> +#include "internal.h"
> +
> +/**
> + * struct event_group - All information about a group of telemetry events.
> + * @pfg: Points to the aggregated telemetry space information
> + * within the OOBMSM driver that contains data for all
> + * telemetry regions.
> + * @list: List of active event groups.
How about "Member of active_event_groups."? (although unclear how this
list is used at this point ... may be more appropriate for a later patch)
> + * @guid: Unique number per XML description file.
> + */
> +struct event_group {
> + /* Data fields for additional structures to manage this group. */
> + struct pmt_feature_group *pfg;
> + struct list_head list;
> +
> + /* Remaining fields initialized from XML file. */
> + u32 guid;
> +};
> +
> +static LIST_HEAD(active_event_groups);
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
> + */
> +static struct event_group energy_0x26696143 = {
> + .guid = 0x26696143,
> +};
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml
> + */
> +static struct event_group perf_0x26557651 = {
> + .guid = 0x26557651,
> +};
> +
> +static struct event_group *known_energy_event_groups[] = {
> + &energy_0x26696143,
> +};
> +
> +#define NUM_KNOWN_ENERGY_GROUPS ARRAY_SIZE(known_energy_event_groups)
Why is this macro needed? I think the code will be easier to understand if this
"ARRAY_SIZE" is open coded instead of using this macro.
> +
> +static struct event_group *known_perf_event_groups[] = {
> + &perf_0x26557651,
> +};
> +
> +#define NUM_KNOWN_PERF_GROUPS ARRAY_SIZE(known_perf_event_groups)
Same wrt macro.
> +
> +/* Stub for now */
> +static int discover_events(struct event_group *e, struct pmt_feature_group *p)
> +{
> + return -EINVAL;
> +}
> +
> +DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
> + if (!IS_ERR_OR_NULL(_T))
> + intel_pmt_put_feature_group(_T))
> +
> +/*
> + * Make a request to the INTEL_PMT_DISCOVERY driver for the
> + * pmt_feature_group for a specific feature. If there is
> + * one the returned structure has an array of telemetry_region
> + * structures. Each describes one telemetry aggregator.
> + * Try to use any with a known matching guid.
"Try to use every telemetry aggregator with a known guid."?
> + */
> +static bool get_pmt_feature(enum pmt_feature_id feature, struct event_group **evgs,
> + unsigned int num_evg)
> +{
> + struct pmt_feature_group *p __free(intel_pmt_put_feature_group) = NULL;
> + struct event_group **peg;
> + bool ret;
"ret" is of type bool but is only used for return value of discover_events() that
returns an int? So when discover_events() return -EINVAL ...?
> +
> + p = intel_pmt_get_regions_by_feature(feature);
> +
> + if (IS_ERR_OR_NULL(p))
> + return false;
> +
> + for (peg = evgs; peg < &evgs[num_evg]; peg++) {
> + ret = discover_events(*peg, p);
> + if (!ret) {
> + (*peg)->pfg = no_free_ptr(p);
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +/*
> + * Ask OOBMSM discovery driver for all the RMID based telemetry groups
> + * that it supports.
> + */
> +bool intel_aet_get_events(void)
> +{
> + bool ret1, ret2;
> +
> + ret1 = get_pmt_feature(FEATURE_PER_RMID_ENERGY_TELEM,
> + known_energy_event_groups, NUM_KNOWN_ENERGY_GROUPS);
Just call ARRAY_SIZE() directly here? I think it will make the code easier to understand.
> + ret2 = get_pmt_feature(FEATURE_PER_RMID_PERF_TELEM,
> + known_perf_event_groups, NUM_KNOWN_PERF_GROUPS);
Same here.
> +
> + return ret1 || ret2;
> +}
> +
> +void __exit intel_aet_exit(void)
> +{
> + struct event_group *evg, *tmp;
> +
> + list_for_each_entry_safe(evg, tmp, &active_event_groups, list) {
This cleanup does not match initialization done in this patch.
> + intel_pmt_put_feature_group(evg->pfg);
> + evg->pfg = NULL;
> + list_del(&evg->list);
> + }
> +}
Reinette
Powered by blists - more mailing lists