[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1dbf47a3-dab3-4fbd-8146-9d4ce1787aa1@intel.com>
Date: Tue, 8 Jul 2025 16:51:45 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghuay@...dia.com>, "Maciej
Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>, Anil Keshavamurthy <anil.s.keshavamurthy@...el.com>,
Chen Yu <yu.c.chen@...el.com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v6 17/30] x86/resctrl: Discover hardware telemetry events
Hi Tony,
On 6/26/25 9:49 AM, Tony Luck wrote:
> Hardware has one or more telemetry event aggregators per package
> for each group of telemetry events. Each aggregator provides access
> to event counts in an array of 64-bit values in MMIO space. There
> is a "guid" (in this case a unique 32-bit integer) which refers to
> an XML file published in the https://github.com/intel/Intel-PMT
"an XML file published in the" -> "an XML file published in"?
> that provides all the details about each aggregator.
>
> The XML files provide the following information:
"The XML files provide" -> "The XML file provides"
(First paragraph refers to a single XML file so I assume it means
all this information is available from one XML file?)
> 1) Which telemetry events are included in the group for this aggregator.
> 2) The order in which the event counters appear for each RMID.
> 3) The value type of each event counter (integer or fixed-point).
> 4) The number of RMIDs supported.
> 5) Which additional aggregator status registers are included.
> 6) The total size of the MMIO region for this aggregator.
>
> Add select of X86_PLATFORM_DEVICES, INTEL_VSEC and
> INTEL_PMT_TELEMETRY to CONFIG_X86_CPU_RESCTRL to enable use of the
> discovery driver that enumerate all aggregators on the system with
> intel_pmt_get_regions_by_feature(). Call this for each pmt_feature_id
> that indicates per-RMID telemetry.
>
> Save the returned pmt_feature_group pointers with guids that are known
> to resctrl for use at run time.
>
> Those pointers are returned to the INTEL_PMT_DISCOVERY driver at
> resctrl_arch_exit() time.
It is not clear to me why this work needs to use two different terms for
the same thing.
Assuming it is required, up until this point this work has only used the
term "aggregator" and within this patch is where this work starts
using "telemetry region" interchangeably. When reading this patch
"telemetry region" is first used in struct event_group without the term
introduced to the reader.
Could you please add a snippet in changelog to help with this transition?
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 3 +
> arch/x86/kernel/cpu/resctrl/core.c | 5 +
> arch/x86/kernel/cpu/resctrl/intel_aet.c | 122 ++++++++++++++++++++++++
> arch/x86/Kconfig | 3 +
> arch/x86/kernel/cpu/resctrl/Makefile | 1 +
> 5 files changed, 134 insertions(+)
> create mode 100644 arch/x86/kernel/cpu/resctrl/intel_aet.c
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 224b71730cc3..e93b15bf6aab 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -169,4 +169,7 @@ void __init intel_rdt_mbm_apply_quirk(void);
>
> void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
>
> +bool intel_aet_get_events(void);
> +void __exit intel_aet_exit(void);
> +
> #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index a5f01cac2363..9144766da836 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -734,6 +734,9 @@ void resctrl_arch_pre_mount(void)
>
> if (!atomic_try_cmpxchg(&only_once, &old, 1))
> return;
> +
> + if (!intel_aet_get_events())
> + return;
> }
>
> enum {
> @@ -1086,6 +1089,8 @@ late_initcall(resctrl_arch_late_init);
>
> static void __exit resctrl_arch_exit(void)
> {
> + intel_aet_exit();
> +
> cpuhp_remove_state(rdt_online);
>
> resctrl_exit();
> diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> new file mode 100644
> index 000000000000..b09044b093dd
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
> @@ -0,0 +1,122 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Resource Director Technology(RDT)
> + * - Intel Application Energy Telemetry
> + *
> + * Copyright (C) 2025 Intel Corporation
> + *
> + * Author:
> + * Tony Luck <tony.luck@...el.com>
> + */
> +
> +#define pr_fmt(fmt) "resctrl: " fmt
> +
> +#include <linux/cleanup.h>
> +#include <linux/cpu.h>
> +#include <linux/intel_vsec.h>
> +#include <linux/resctrl.h>
> +
> +#include "internal.h"
> +
> +/**
> + * struct event_group - All information about a group of telemetry events.
> + * @pfg: Points to the aggregated telemetry space information
> + * within the OOBMSM driver that contains data for all
> + * telemetry regions.
> + * @guid: Unique number per XML description file.
> + */
> +struct event_group {
> + /* Data fields for additional structures to manage this group. */
> + struct pmt_feature_group *pfg;
> +
> + /* Remaining fields initialized from XML file. */
> + u32 guid;
> +};
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
> + */
> +static struct event_group energy_0x26696143 = {
> + .guid = 0x26696143,
> +};
> +
> +/*
> + * Link: https://github.com/intel/Intel-PMT
> + * File: xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml
> + */
> +static struct event_group perf_0x26557651 = {
> + .guid = 0x26557651,
> +};
> +
> +static struct event_group *known_event_groups[] = {
> + &energy_0x26696143,
> + &perf_0x26557651,
> +};
> +
> +#define NUM_KNOWN_GROUPS ARRAY_SIZE(known_event_groups)
> +
> +/* Stub for now */
> +static int configure_events(struct event_group *e, struct pmt_feature_group *p)
> +{
> + return -EINVAL;
> +}
> +
> +DEFINE_FREE(intel_pmt_put_feature_group, struct pmt_feature_group *,
> + if (!IS_ERR_OR_NULL(_T))
> + intel_pmt_put_feature_group(_T))
As you state in cover this snippet cannot make checkpatch.pl happy.
I would propose that the issues trying to appease checkpatch.pl are
documented in the maintainer notes of this patch.
> +
> +/*
> + * Make a request to the INTEL_PMT_DISCOVERY driver for the
> + * pmt_feature_group for a specific feature. If there is
> + * one the returned structure has an array of telemetry_region
> + * structures. Each describes one telemetry aggregator.
> + * Try to configure any with a known matching guid.
I interpret "configure" to involve "write" activity where, for example,
settings are changed and things are ... configured. The way this is
written it seems that resctrl is configuring (i.e. making changes to)
events known to it. This is not the case though (?), resctrl is just
doing discovery of events here. How about above is instead:
Try to discover any with a known matching guid
and configure_events() -> discover_events()?
> + */
> +static bool get_pmt_feature(enum pmt_feature_id feature)
> +{
> + struct pmt_feature_group *p __free(intel_pmt_put_feature_group) = NULL;
> + struct event_group **peg;
> + bool ret;
> +
> + p = intel_pmt_get_regions_by_feature(feature);
> +
> + if (IS_ERR_OR_NULL(p))
> + return false;
> +
> + for (peg = &known_event_groups[0]; peg < &known_event_groups[NUM_KNOWN_GROUPS]; peg++) {
> + ret = configure_events(*peg, p);
> + if (!ret) {
> + (*peg)->pfg = no_free_ptr(p);
> + return true;
> + }
> + }
> +
> + return false;
> +}
> +
> +/*
> + * Ask OOBMSM discovery driver for all the RMID based telemetry groups
> + * that it supports.
> + */
> +bool intel_aet_get_events(void)
> +{
> + bool ret1, ret2;
> +
> + ret1 = get_pmt_feature(FEATURE_PER_RMID_ENERGY_TELEM);
> + ret2 = get_pmt_feature(FEATURE_PER_RMID_PERF_TELEM);
> +
> + return ret1 || ret2;
> +}
> +
> +void __exit intel_aet_exit(void)
> +{
> + struct event_group **peg;
> +
> + for (peg = &known_event_groups[0]; peg < &known_event_groups[NUM_KNOWN_GROUPS]; peg++) {
> + if ((*peg)->pfg) {
> + intel_pmt_put_feature_group((*peg)->pfg);
> + (*peg)->pfg = NULL;
> + }
> + }
> +}
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 71019b3b54ea..8eb68d2230be 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
Kconfig has its own thread.
Reinette
Powered by blists - more mailing lists