[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e65d2b54-e14c-4b1d-a9ea-04b6e0eb051f@intel.com>
Date: Tue, 13 Jan 2026 15:14:29 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Ben Horgan <ben.horgan@....com>
CC: <amitsinght@...vell.com>, <baisheng.gao@...soc.com>,
<baolin.wang@...ux.alibaba.com>, <carl@...amperecomputing.com>,
<dave.martin@....com>, <david@...nel.org>, <dfustini@...libre.com>,
<fenghuay@...dia.com>, <gshan@...hat.com>, <james.morse@....com>,
<jonathan.cameron@...wei.com>, <kobak@...dia.com>, <lcherian@...vell.com>,
<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<peternewman@...gle.com>, <punit.agrawal@....qualcomm.com>,
<quic_jiles@...cinc.com>, <rohit.mathew@....com>,
<scott@...amperecomputing.com>, <sdonthineni@...dia.com>,
<tan.shaopeng@...itsu.com>, <xhao@...ux.alibaba.com>,
<catalin.marinas@....com>, <will@...nel.org>, <corbet@....net>,
<maz@...nel.org>, <oupton@...nel.org>, <joey.gouly@....com>,
<suzuki.poulose@....com>, <kvmarm@...ts.linux.dev>
Subject: Re: [PATCH v3 28/47] arm_mpam: resctrl: Add support for csu counters
Hi Ben,
On 1/12/26 8:58 AM, Ben Horgan wrote:
> From: James Morse <james.morse@....com>
>
> resctrl exposes a counter via a file named llc_occupancy. This isn't really
> a counter as its value goes up and down, this is a snapshot of the cache
> storage usage monitor.
>
> Add some picking code to find a cache as close as possible to the L3 that
> supports the CSU monitor.
>
> If there is an L3, but it doesn't have any controls, force the L3 resource
> to exist. The existing topology_matches_l3() and
> mpam_resctrl_domain_hdr_init() code will ensure this looks like the L3,
> even if the class belongs to a later cache.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@...wei.com>
> Signed-off-by: James Morse <james.morse@....com>
> Co-developed-by: Dave Martin <dave.martin@....com>
> Signed-off-by: Dave Martin <dave.martin@....com>
> Signed-off-by: Ben Horgan <ben.horgan@....com>
> ---
> Changes since rfc:
> Allow csu counters however many partid or pmg there are
> else if -> if
> reduce scope of local variables
> drop has_csu
>
> Changes since v2:
> return -> break so works for mbwu in later patch
> add for_each_mpam_resctrl_mon
> return error from mpam_resctrl_monitor_init(). It may fail when is abmc
> allocation introduced in a later patch.
> Squashed in patch from Dave Martin:
> https://lore.kernel.org/lkml/20250820131621.54983-1-Dave.Martin@arm.com/
> ---
> drivers/resctrl/mpam_internal.h | 6 ++
> drivers/resctrl/mpam_resctrl.c | 173 +++++++++++++++++++++++++++++++-
> 2 files changed, 174 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
> index f89ceaf7623d..21cc776e57aa 100644
> --- a/drivers/resctrl/mpam_internal.h
> +++ b/drivers/resctrl/mpam_internal.h
> @@ -349,6 +349,12 @@ struct mpam_resctrl_res {
> struct rdt_resource resctrl_res;
> };
>
> +struct mpam_resctrl_mon {
> + struct mpam_class *class;
> +
> + /* per-class data that resctrl needs will live here */
> +};
> +
> static inline int mpam_alloc_csu_mon(struct mpam_class *class)
> {
> struct mpam_props *cprops = &class->props;
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index 7402bf4293b6..5020a5faed96 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -37,6 +37,21 @@ static struct mpam_resctrl_res mpam_resctrl_controls[RDT_NUM_RESOURCES];
> /* The lock for modifying resctrl's domain lists from cpuhp callbacks. */
> static DEFINE_MUTEX(domain_list_lock);
>
> +/*
> + * The classes we've picked to map to resctrl events.
> + * Resctrl believes all the worlds a Xeon, and these are all on the L3. This
> + * array lets us find the actual class backing the event counters. e.g.
> + * the only memory bandwidth counters may be on the memory controller, but to
> + * make use of them, we pretend they are on L3.
> + * Class pointer may be NULL.
> + */
> +static struct mpam_resctrl_mon mpam_resctrl_counters[QOS_NUM_EVENTS];
> +
> +#define for_each_mpam_resctrl_mon(mon, eventid) \
> + for (eventid = 0, mon = &mpam_resctrl_counters[eventid]; \
> + eventid < QOS_NUM_EVENTS; \
> + eventid++, mon = &mpam_resctrl_counters[eventid])
> +
Reading the above loop and how it is used to call mpam_resctrl_monitor_init() for every event
it looks like there is an implicit assumption that MPAM supports all events known to
resctrl.
Please consider the most recent resctrl feature "telemetry monitoring" currently queued
for inclusion: https://lore.kernel.org/lkml/20251217172121.12030-1-tony.luck@intel.com/
(You can find latest resctrl code queued for inclusion on the x86/cache branch of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git)
New telemetry monitoring introduces several new events known to resctrl. Specifically, here
is how enum resctrl_event_id looks at the moment:
/* Event IDs */
enum resctrl_event_id {
/* Must match value of first event below */
QOS_FIRST_EVENT = 0x01,
/*
* These values match those used to program IA32_QM_EVTSEL before
* reading IA32_QM_CTR on RDT systems.
*/
QOS_L3_OCCUP_EVENT_ID = 0x01,
QOS_L3_MBM_TOTAL_EVENT_ID = 0x02,
QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
/* Intel Telemetry Events */
PMT_EVENT_ENERGY,
PMT_EVENT_ACTIVITY,
PMT_EVENT_STALLS_LLC_HIT,
PMT_EVENT_C1_RES,
PMT_EVENT_UNHALTED_CORE_CYCLES,
PMT_EVENT_STALLS_LLC_MISS,
PMT_EVENT_AUTO_C6_RES,
PMT_EVENT_UNHALTED_REF_CYCLES,
PMT_EVENT_UOPS_RETIRED,
/* Must be the last */
QOS_NUM_EVENTS,
};
...
> @@ -582,6 +677,57 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
> return comp->comp_id;
> }
>
> +static int mpam_resctrl_monitor_init(struct mpam_resctrl_mon *mon,
> + enum resctrl_event_id type)
> +{
> + struct mpam_resctrl_res *res = &mpam_resctrl_controls[RDT_RESOURCE_L3];
> + struct rdt_resource *l3 = &res->resctrl_res;
> +
> + lockdep_assert_cpus_held();
> +
> + /* There also needs to be an L3 cache present */
> + if (get_cpu_cacheinfo_id(smp_processor_id(), 3) == -1)
> + return 0;
> +
> + /*
> + * If there are no MPAM resources on L3, force it into existence.
> + * topology_matches_l3() already ensures this looks like the L3.
> + * The domain-ids will be fixed up by mpam_resctrl_domain_hdr_init().
> + */
> + if (!res->class) {
> + pr_warn_once("Faking L3 MSC to enable counters.\n");
> + res->class = mpam_resctrl_counters[type].class;
> + }
> +
> + /* Called multiple times!, once per event type */
> + if (exposed_mon_capable) {
> + l3->mon_capable = true;
> +
> + /* Setting name is necessary on monitor only platforms */
> + l3->name = "L3";
> + l3->mon_scope = RESCTRL_L3_CACHE;
> +
> + resctrl_enable_mon_event(type);
btw, the telemetry work also changed this function prototype to be:
bool resctrl_enable_mon_event(enum resctrl_event_id eventid, bool any_cpu,
unsigned int binary_bits, void *arch_priv);
If I understand correctly resctrl_enable_mon_event() will be called for every event in
enum resctrl_event_id which now contains events that may not actually be supported. I think it
may be safer to be specific in which events MPAM wants to enable.
> +
> + /*
> + * Unfortunately, num_rmid doesn't mean anything for
> + * mpam, and its exposed to user-space!
> + *
The idea of adding a per MON group "num_mon_groups" file has been floated a couple of
times now. I have not heard any objections against doing something like this.
https://lore.kernel.org/all/cbe665c2-fe83-e446-1696-7115c0f9fd76@arm.com/
https://lore.kernel.org/lkml/46767ca7-1f1b-48e8-8ce6-be4b00d129f9@intel.com/
> + * num-rmid is supposed to mean the minimum number of
> + * monitoring groups that can exist simultaneously, including
> + * the default monitoring group for each control group.
> + *
> + * For mpam, each control group has its own pmg/rmid space, so
> + * it is not appropriate to advertise the whole rmid_idx space
> + * here. But the pmgs corresponding to the parent control
> + * group can be allocated freely:
> + */
> + l3->mon.num_rmid = mpam_pmg_max + 1;
> + }
> +
> + return 0;
> +}
> +
Reinette
Powered by blists - more mailing lists