[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dfc593f2-b0e6-4946-9dcd-c5f3986c9e6c@amd.com>
Date: Thu, 29 May 2025 11:05:16 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
tony.luck@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com
Cc: james.morse@....com, dave.martin@....com, fenghuay@...dia.com,
x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
ardb@...nel.org, gregkh@...uxfoundation.org, daniel.sneddon@...ux.intel.com,
jpoimboe@...nel.org, alexandre.chartre@...cle.com,
pawan.kumar.gupta@...ux.intel.com, thomas.lendacky@....com,
perry.yuan@....com, seanjc@...gle.com, kai.huang@...el.com,
xiaoyao.li@...el.com, kan.liang@...ux.intel.com, xin3.li@...el.com,
ebiggers@...gle.com, xin@...or.com, sohil.mehta@...el.com,
andrew.cooper3@...rix.com, mario.limonciello@....com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
peternewman@...gle.com, maciej.wieczor-retman@...el.com, eranian@...gle.com,
Xiaojian.Du@....com, gautham.shenoy@....com
Subject: Re: [PATCH v13 13/27] x86/resctrl: Add the functionality to assign
MBM events
Hi Reinette,
On 5/22/25 17:41, Reinette Chatre wrote:
> Hi Babu,
>
> On 5/15/25 3:51 PM, Babu Moger wrote:
>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>> can be assigned to RMID, event pair and monitor the bandwidth as long
>
> "RMID, event pairs"? (assuming at this point in new version it will be
> obvious what is meant by "event").
Sure.
>
>> as it is assigned.
>>
>> Add the functionality to allocate and assign a counter to am RMID, event
>
> "am" -> "an"
>
sure.
>> pair in the domain.
>>
>> If all the counters are in use, kernel will log the error message "Unable
>> to allocate counter in domain" in /sys/fs/resctrl/info/last_cmd_status
>> when a new assignment is requested. Exit on the first failure when
>> assigning counters across all the domains.
>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>
> ...
>
>> ---
>> fs/resctrl/internal.h | 3 +
>> fs/resctrl/monitor.c | 134 ++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 137 insertions(+)
>>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 0fae374559ba..ce4fcac91937 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -377,6 +377,9 @@ bool closid_allocated(unsigned int closid);
>>
>> int resctrl_find_cleanest_closid(void);
>>
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>> +
>> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>>
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 8e403587a02f..d76fd0840946 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -934,3 +934,137 @@ void resctrl_mon_resource_exit(void)
>>
>> dom_data_exit(r);
>> }
>> +
>> +/*
>> + * Configure the counter for the event, RMID pair for the domain. Reset the
>> + * non-architectural state to clear all the event counters.
>
> clear *all* the event counters?
>
> "Reset the non-architectural state to clear all the event counters." ->
> "Reset the associated non-architectural state."?
ok.
>
> Also, please see https://lore.kernel.org/lkml/20250429003359.375508-3-tony.luck@intel.com/
Yes. Sure.
>
>> + */
>> +static void resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> + u32 cntr_id, u32 evt_cfg, bool assign)
>> +{
>> + struct mbm_state *m;
>> +
>> + resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
>> +
>> + m = get_mbm_state(d, closid, rmid, evtid);
>> + if (m)
>> + memset(m, 0, sizeof(struct mbm_state));
>> +}
>> +
>> +/*
>> + * mbm_cntr_get() - Return the cntr_id for the matching evtid and rdtgrp in
>> + * cntr_cfg array.
>
> Please prefix parameter names with @ in description to make obvious what is
> refered to. Although "cntr_id" is a local variable so may be easier to parse
> if cntr_id is replaced with actual "counter ID" term while keeping rest as
> actual parameters. That makes cntr_cfg unneeded.
Sure.
> If intending to explain function context then failure return should also
> be documented. Even better would be to follow typical style of kernel-doc
> (even if not using /** start) and not mix and match so randomly.
Sure.
>
>> + */
>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>
> A subtle issue here is only evident from later patches, for example patch #17,
> that calls mbm_cntr_get() with a non MBM event ID from __mon_event_count().
>
> If this usage is expected then these utilities needs extra checks to
> ensure they are only called with valid MBM event IDs.
Sure. Will add the check resctrl_is_mbm_event().
>
>> + int cntr_id;
>> +
>> + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> + if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
>> + d->cntr_cfg[cntr_id].evtid == evtid)
>> + return cntr_id;
>> + }
>> +
>> + return -ENOENT;
>> +}
>> +
>> +/*
>> + * mbm_cntr_alloc() - Return the first free entry in cntr_cfg array.
>
> "Return the first ...array." -> "Initilialize and return ID of a new counter, return -ENOSPC on failure." ?
> This is still an awkward use of kernel-doc ... better to be properly formatted.
Sure.
>
>> + */
>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> + int cntr_id;
>> +
>> + for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> + if (!d->cntr_cfg[cntr_id].rdtgrp) {
>> + d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
>> + d->cntr_cfg[cntr_id].evtid = evtid;
>> + return cntr_id;
>> + }
>> + }
>> +
>> + return -ENOSPC;
>> +}
>> +
>> +/*
>> + * mbm_get_mon_event() - Return the mon_evt entry for the matching evtid.
>> + */
>> +static struct mon_evt *mbm_get_mon_event(struct rdt_resource *r,
>> + enum resctrl_event_id evtid)
>> +{
>> + struct mon_evt *mevt;
>> +
>> + list_for_each_entry(mevt, &r->mon.evt_list, list) {
>> + if (mevt->evtid == evtid)
>> + return mevt;
>> + }
>
> With changes from telemetry series this becomes an array lookup.
Sure. Will look into this.
>
>> +
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Allocate a fresh counter and configure the event if not assigned already.
>> + */
>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> + struct mon_evt *mevt;
>> + int cntr_id;
>> +
>> + /* No need to allocate a new counter if it is already assigned */
>> + cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>> + if (cntr_id >= 0)
>> + goto cntr_configure;
>> +
>> + cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
>> + if (cntr_id < 0) {
>> + rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
>> + d->hdr.id);
>> + return cntr_id;
>> + }
>> +
>> +cntr_configure:
>> + mevt = mbm_get_mon_event(r, evtid);
>> + if (!mevt) {
>> + rdt_last_cmd_printf("Invalid event id %d\n", evtid);
>
> Difficult to see at this point but it seems that this is in kernel bug territory since
> user space provided text that is translated to event ID and here translated back to
> monitor event. This must succeed. Could this be simplified and back-and-forth avoided
> by passing the mon_evt instead of event ID?
We can do that.
>
>> + return -EINVAL;
>> + }
>
>
>
>> +
>> + /*
>> + * Skip reconfiguration if the event setup is current; otherwise,
>> + * update and apply the new configuration to the domain.
>> + */
>> + if (mevt->evt_cfg != d->cntr_cfg[cntr_id].evt_cfg) {
>
> Lost me. Previous patch silently created mon_event::evt_cfg without initializing it.
> Here it is compared and treated as the "source of truth" ... where does its value
> come from?
Yes. That is correct. Will have to initialize evt_cfg when it is first
introduced. Will do.
>
>> + d->cntr_cfg[cntr_id].evt_cfg = mevt->evt_cfg;
>> + resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
>> + cntr_id, mevt->evt_cfg, true);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> + * Assign a hardware counter to event @evtid of group @rdtgrp.
>> + * Assign counters to all domains if @d is NULL; otherwise, assign the
>> + * counter to the specified domain @d.
>
> Can add here what is mentioned in changelog that this exits on first failure
> and so highlight that this can have partial assignment when exit on such failure.
Sure.
>
>> + */
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> + int ret = 0;
>> +
>> + if (!d) {
>> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> + ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
>> + if (ret)
>> + return ret;
>> + }
>> + } else {
>> + ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
>> + }
>> +
>> + return ret;
>> +}
>
> Reinette
>
--
Thanks
Babu Moger
Powered by blists - more mailing lists