[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9b38056d-c0f3-4e28-87e4-413225fee91e@intel.com>
Date: Thu, 22 May 2025 15:41:53 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <tony.luck@...el.com>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>
CC: <james.morse@....com>, <dave.martin@....com>, <fenghuay@...dia.com>,
<x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<akpm@...ux-foundation.org>, <thuth@...hat.com>, <rostedt@...dmis.org>,
<ardb@...nel.org>, <gregkh@...uxfoundation.org>,
<daniel.sneddon@...ux.intel.com>, <jpoimboe@...nel.org>,
<alexandre.chartre@...cle.com>, <pawan.kumar.gupta@...ux.intel.com>,
<thomas.lendacky@....com>, <perry.yuan@....com>, <seanjc@...gle.com>,
<kai.huang@...el.com>, <xiaoyao.li@...el.com>, <kan.liang@...ux.intel.com>,
<xin3.li@...el.com>, <ebiggers@...gle.com>, <xin@...or.com>,
<sohil.mehta@...el.com>, <andrew.cooper3@...rix.com>,
<mario.limonciello@....com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <peternewman@...gle.com>,
<maciej.wieczor-retman@...el.com>, <eranian@...gle.com>,
<Xiaojian.Du@....com>, <gautham.shenoy@....com>
Subject: Re: [PATCH v13 13/27] x86/resctrl: Add the functionality to assign
MBM events
Hi Babu,
On 5/15/25 3:51 PM, Babu Moger wrote:
> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
> can be assigned to RMID, event pair and monitor the bandwidth as long
"RMID, event pairs"? (assuming at this point in new version it will be
obvious what is meant by "event").
> as it is assigned.
>
> Add the functionality to allocate and assign a counter to am RMID, event
"am" -> "an"
> pair in the domain.
>
> If all the counters are in use, kernel will log the error message "Unable
> to allocate counter in domain" in /sys/fs/resctrl/info/last_cmd_status
> when a new assignment is requested. Exit on the first failure when
> assigning counters across all the domains.
>
> Signed-off-by: Babu Moger <babu.moger@....com>
> ---
...
> ---
> fs/resctrl/internal.h | 3 +
> fs/resctrl/monitor.c | 134 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 137 insertions(+)
>
> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
> index 0fae374559ba..ce4fcac91937 100644
> --- a/fs/resctrl/internal.h
> +++ b/fs/resctrl/internal.h
> @@ -377,6 +377,9 @@ bool closid_allocated(unsigned int closid);
>
> int resctrl_find_cleanest_closid(void);
>
> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
> +
> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>
> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
> index 8e403587a02f..d76fd0840946 100644
> --- a/fs/resctrl/monitor.c
> +++ b/fs/resctrl/monitor.c
> @@ -934,3 +934,137 @@ void resctrl_mon_resource_exit(void)
>
> dom_data_exit(r);
> }
> +
> +/*
> + * Configure the counter for the event, RMID pair for the domain. Reset the
> + * non-architectural state to clear all the event counters.
clear *all* the event counters?
"Reset the non-architectural state to clear all the event counters." ->
"Reset the associated non-architectural state."?
Also, please see https://lore.kernel.org/lkml/20250429003359.375508-3-tony.luck@intel.com/
> + */
> +static void resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
> + u32 cntr_id, u32 evt_cfg, bool assign)
> +{
> + struct mbm_state *m;
> +
> + resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
> +
> + m = get_mbm_state(d, closid, rmid, evtid);
> + if (m)
> + memset(m, 0, sizeof(struct mbm_state));
> +}
> +
> +/*
> + * mbm_cntr_get() - Return the cntr_id for the matching evtid and rdtgrp in
> + * cntr_cfg array.
Please prefix parameter names with @ in description to make obvious what is
refered to. Although "cntr_id" is a local variable so may be easier to parse
if cntr_id is replaced with actual "counter ID" term while keeping rest as
actual parameters. That makes cntr_cfg unneeded.
If intending to explain function context then failure return should also
be documented. Even better would be to follow typical style of kernel-doc
(even if not using /** start) and not mix and match so randomly.
> + */
> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
A subtle issue here is only evident from later patches, for example patch #17,
that calls mbm_cntr_get() with a non MBM event ID from __mon_event_count().
If this usage is expected then these utilities needs extra checks to
ensure they are only called with valid MBM event IDs.
> + int cntr_id;
> +
> + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> + if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
> + d->cntr_cfg[cntr_id].evtid == evtid)
> + return cntr_id;
> + }
> +
> + return -ENOENT;
> +}
> +
> +/*
> + * mbm_cntr_alloc() - Return the first free entry in cntr_cfg array.
"Return the first ...array." -> "Initilialize and return ID of a new counter, return -ENOSPC on failure." ?
This is still an awkward use of kernel-doc ... better to be properly formatted.
> + */
> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
> + int cntr_id;
> +
> + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
> + if (!d->cntr_cfg[cntr_id].rdtgrp) {
> + d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
> + d->cntr_cfg[cntr_id].evtid = evtid;
> + return cntr_id;
> + }
> + }
> +
> + return -ENOSPC;
> +}
> +
> +/*
> + * mbm_get_mon_event() - Return the mon_evt entry for the matching evtid.
> + */
> +static struct mon_evt *mbm_get_mon_event(struct rdt_resource *r,
> + enum resctrl_event_id evtid)
> +{
> + struct mon_evt *mevt;
> +
> + list_for_each_entry(mevt, &r->mon.evt_list, list) {
> + if (mevt->evtid == evtid)
> + return mevt;
> + }
With changes from telemetry series this becomes an array lookup.
> +
> + return NULL;
> +}
> +
> +/*
> + * Allocate a fresh counter and configure the event if not assigned already.
> + */
> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
> + struct mon_evt *mevt;
> + int cntr_id;
> +
> + /* No need to allocate a new counter if it is already assigned */
> + cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
> + if (cntr_id >= 0)
> + goto cntr_configure;
> +
> + cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
> + if (cntr_id < 0) {
> + rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
> + d->hdr.id);
> + return cntr_id;
> + }
> +
> +cntr_configure:
> + mevt = mbm_get_mon_event(r, evtid);
> + if (!mevt) {
> + rdt_last_cmd_printf("Invalid event id %d\n", evtid);
Difficult to see at this point but it seems that this is in kernel bug territory since
user space provided text that is translated to event ID and here translated back to
monitor event. This must succeed. Could this be simplified and back-and-forth avoided
by passing the mon_evt instead of event ID?
> + return -EINVAL;
> + }
> +
> + /*
> + * Skip reconfiguration if the event setup is current; otherwise,
> + * update and apply the new configuration to the domain.
> + */
> + if (mevt->evt_cfg != d->cntr_cfg[cntr_id].evt_cfg) {
Lost me. Previous patch silently created mon_event::evt_cfg without initializing it.
Here it is compared and treated as the "source of truth" ... where does its value
come from?
> + d->cntr_cfg[cntr_id].evt_cfg = mevt->evt_cfg;
> + resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
> + cntr_id, mevt->evt_cfg, true);
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Assign a hardware counter to event @evtid of group @rdtgrp.
> + * Assign counters to all domains if @d is NULL; otherwise, assign the
> + * counter to the specified domain @d.
Can add here what is mentioned in changelog that this exits on first failure
and so highlight that this can have partial assignment when exit on such failure.
> + */
> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
> +{
> + int ret = 0;
> +
> + if (!d) {
> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
> + ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
> + if (ret)
> + return ret;
> + }
> + } else {
> + ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
> + }
> +
> + return ret;
> +}
Reinette
Powered by blists - more mailing lists