Message-ID: <dfc593f2-b0e6-4946-9dcd-c5f3986c9e6c@amd.com>
Date: Thu, 29 May 2025 11:05:16 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 tony.luck@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com
Cc: james.morse@....com, dave.martin@....com, fenghuay@...dia.com,
 x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
 akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
 ardb@...nel.org, gregkh@...uxfoundation.org, daniel.sneddon@...ux.intel.com,
 jpoimboe@...nel.org, alexandre.chartre@...cle.com,
 pawan.kumar.gupta@...ux.intel.com, thomas.lendacky@....com,
 perry.yuan@....com, seanjc@...gle.com, kai.huang@...el.com,
 xiaoyao.li@...el.com, kan.liang@...ux.intel.com, xin3.li@...el.com,
 ebiggers@...gle.com, xin@...or.com, sohil.mehta@...el.com,
 andrew.cooper3@...rix.com, mario.limonciello@....com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 peternewman@...gle.com, maciej.wieczor-retman@...el.com, eranian@...gle.com,
 Xiaojian.Du@....com, gautham.shenoy@....com
Subject: Re: [PATCH v13 13/27] x86/resctrl: Add the functionality to assign
 MBM events

Hi Reinette,

On 5/22/25 17:41, Reinette Chatre wrote:
> Hi Babu,
> 
> On 5/15/25 3:51 PM, Babu Moger wrote:
>> The mbm_cntr_assign mode offers "num_mbm_cntrs" number of counters that
>> can be assigned to RMID, event pair and monitor the bandwidth as long
> 
> "RMID, event pairs"? (assuming at this point in new version it will be
> obvious what is meant by "event").

Sure.

> 
>> as it is assigned.
>>
>> Add the functionality to allocate and assign a counter to am RMID, event
> 
> "am" -> "an"
> 

Sure.

>> pair in the domain.
>>
>> If all the counters are in use, kernel will log the error message "Unable
>> to allocate counter in domain" in /sys/fs/resctrl/info/last_cmd_status
>> when a new assignment is requested. Exit on the first failure when
>> assigning counters across all the domains.
>>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
> 
> ...
> 
>> ---
>>  fs/resctrl/internal.h |   3 +
>>  fs/resctrl/monitor.c  | 134 ++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 137 insertions(+)
>>
>> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
>> index 0fae374559ba..ce4fcac91937 100644
>> --- a/fs/resctrl/internal.h
>> +++ b/fs/resctrl/internal.h
>> @@ -377,6 +377,9 @@ bool closid_allocated(unsigned int closid);
>>  
>>  int resctrl_find_cleanest_closid(void);
>>  
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
>> +
>>  #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
>>  int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
>>  
>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
>> index 8e403587a02f..d76fd0840946 100644
>> --- a/fs/resctrl/monitor.c
>> +++ b/fs/resctrl/monitor.c
>> @@ -934,3 +934,137 @@ void resctrl_mon_resource_exit(void)
>>  
>>  	dom_data_exit(r);
>>  }
>> +
>> +/*
>> + * Configure the counter for the event, RMID pair for the domain. Reset the
>> + * non-architectural state to clear all the event counters.
> 
> clear *all* the event counters?
> 
> "Reset the non-architectural state to clear all the event counters." ->
> "Reset the associated non-architectural state."?

ok.

> 
> Also, please see https://lore.kernel.org/lkml/20250429003359.375508-3-tony.luck@intel.com/

Yes. Sure.

> 
>> + */
>> +static void resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +				enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> +				u32 cntr_id, u32 evt_cfg, bool assign)
>> +{
>> +	struct mbm_state *m;
>> +
>> +	resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, evt_cfg, assign);
>> +
>> +	m = get_mbm_state(d, closid, rmid, evtid);
>> +	if (m)
>> +		memset(m, 0, sizeof(struct mbm_state));
>> +}
>> +
>> +/*
>> + * mbm_cntr_get() - Return the cntr_id for the matching evtid and rdtgrp in
>> + *		    cntr_cfg array.
> 
> Please prefix parameter names with @ in description to make obvious what is
> referred to. Although "cntr_id" is a local variable so may be easier to parse
> if cntr_id is replaced with actual "counter ID" term while keeping rest as
> actual parameters. That makes cntr_cfg unneeded.

Sure.


> If intending to explain function context then failure return should also
> be documented. Even better would be to follow typical style of kernel-doc
> (even if not using /** start) and not mix and match so randomly.

Sure.

> 
>> + */
>> +static int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
> 
> A subtle issue here is only evident from later patches, for example patch #17,
> that calls mbm_cntr_get() with a non MBM event ID from __mon_event_count().
> 
> If this usage is expected then these utilities need extra checks to
> ensure they are only called with valid MBM event IDs.

Sure. Will add the check resctrl_is_mbm_event().
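
To make the idea concrete, here is a rough user-space sketch of what the
guarded lookup could look like, with the comment in kernel-doc layout as
requested above. It is only a model under stated assumptions: the structures
are cut-down stand-ins for the kernel ones, NUM_MBM_CNTRS replaces
r->mon.num_mbm_cntrs, and returning -ENOENT for a non-MBM event is my
assumption, not a decision made in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Event IDs, same numeric values as the kernel's enum resctrl_event_id. */
enum resctrl_event_id {
	QOS_L3_OCCUP_EVENT_ID		= 0x01,
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/* Cut-down stand-ins for the kernel structures (assumption). */
struct rdtgroup {
	int id;
};

struct mbm_cntr_cfg {
	enum resctrl_event_id evtid;
	struct rdtgroup *rdtgrp;
};

#define NUM_MBM_CNTRS 4	/* models r->mon.num_mbm_cntrs */

struct rdt_mon_domain {
	struct mbm_cntr_cfg cntr_cfg[NUM_MBM_CNTRS];
};

/* True only for the two MBM events (total and local bandwidth). */
static bool resctrl_is_mbm_event(enum resctrl_event_id evtid)
{
	return evtid >= QOS_L3_MBM_TOTAL_EVENT_ID &&
	       evtid <= QOS_L3_MBM_LOCAL_EVENT_ID;
}

/**
 * mbm_cntr_get() - Find the counter assigned to an RMID, event pair
 * @d:		Domain whose counter configuration is searched.
 * @rdtgrp:	Resource group the counter must belong to.
 * @evtid:	MBM event the counter must be configured for.
 *
 * Return: counter ID on success, -ENOENT if @evtid is not an MBM event
 * or if no counter is assigned to @rdtgrp and @evtid in @d.
 */
static int mbm_cntr_get(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
			enum resctrl_event_id evtid)
{
	int cntr_id;

	/* Callers may pass any event ID; reject the non-MBM ones early. */
	if (!resctrl_is_mbm_event(evtid))
		return -ENOENT;

	for (cntr_id = 0; cntr_id < NUM_MBM_CNTRS; cntr_id++) {
		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
		    d->cntr_cfg[cntr_id].evtid == evtid)
			return cntr_id;
	}

	return -ENOENT;
}
```

With this shape a caller such as __mon_event_count() gets the same "not
assigned" answer for an occupancy event as for an unassigned MBM event.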

> 
>> +	int cntr_id;
>> +
>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> +		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
>> +		    d->cntr_cfg[cntr_id].evtid == evtid)
>> +			return cntr_id;
>> +	}
>> +
>> +	return -ENOENT;
>> +}
>> +
>> +/*
>> + * mbm_cntr_alloc() - Return the first free entry in cntr_cfg array.
> 
> "Return the first ...array."  -> "Initialize and return ID of a new counter, return -ENOSPC on failure." ?
> This is still an awkward use of kernel-doc ... better to be properly formatted.

Sure.

> 
>> + */
>> +static int mbm_cntr_alloc(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			  struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> +	int cntr_id;
>> +
>> +	for (cntr_id = 0; cntr_id < r->mon.num_mbm_cntrs; cntr_id++) {
>> +		if (!d->cntr_cfg[cntr_id].rdtgrp) {
>> +			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
>> +			d->cntr_cfg[cntr_id].evtid = evtid;
>> +			return cntr_id;
>> +		}
>> +	}
>> +
>> +	return -ENOSPC;
>> +}
>> +
>> +/*
>> + * mbm_get_mon_event() - Return the mon_evt entry for the matching evtid.
>> + */
>> +static struct mon_evt *mbm_get_mon_event(struct rdt_resource *r,
>> +					 enum resctrl_event_id evtid)
>> +{
>> +	struct mon_evt *mevt;
>> +
>> +	list_for_each_entry(mevt, &r->mon.evt_list, list) {
>> +		if (mevt->evtid == evtid)
>> +			return mevt;
>> +	}
> 
> With changes from the telemetry series this becomes an array lookup.

Sure. Will look into this.

> 
>> +
>> +	return NULL;
>> +}
>> +
>> +/*
>> + * Allocate a fresh counter and configure the event if not assigned already.
>> + */
>> +static int resctrl_alloc_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +				     struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> +	struct mon_evt *mevt;
>> +	int cntr_id;
>> +
>> +	/* No need to allocate a new counter if it is already assigned */
>> +	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
>> +	if (cntr_id >= 0)
>> +		goto cntr_configure;
>> +
>> +	cntr_id = mbm_cntr_alloc(r, d, rdtgrp, evtid);
>> +	if (cntr_id <  0) {
>> +		rdt_last_cmd_printf("Unable to allocate counter in domain %d\n",
>> +				    d->hdr.id);
>> +		return cntr_id;
>> +	}
>> +
>> +cntr_configure:
>> +	mevt = mbm_get_mon_event(r, evtid);
>> +	if (!mevt) {
>> +		rdt_last_cmd_printf("Invalid event id %d\n", evtid);
> 
> Difficult to see at this point but it seems that this is in kernel bug territory since
> user space provided text that is translated to event ID and here translated back to
> monitor event. This must succeed. Could this be simplified and back-and-forth avoided
> by passing the mon_evt instead of event ID?

We can do that.
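
Roughly, the helper would take the mon_evt directly, so the evtid to mon_evt
lookup and its "Invalid event id" error path go away. Below is a user-space
sketch of that shape, not the actual next version: the structures are
simplified stand-ins, NUM_MBM_CNTRS replaces r->mon.num_mbm_cntrs, and
config_cntr_stub() with its hw_writes counter is a hypothetical placeholder
for resctrl_config_cntr().

```c
#include <errno.h>
#include <stddef.h>

enum resctrl_event_id {
	QOS_L3_MBM_TOTAL_EVENT_ID	= 0x02,
	QOS_L3_MBM_LOCAL_EVENT_ID	= 0x03,
};

/* Simplified stand-ins for the kernel structures (assumption). */
struct rdtgroup {
	int id;
};

struct mon_evt {
	enum resctrl_event_id evtid;
	unsigned int evt_cfg;
};

struct mbm_cntr_cfg {
	enum resctrl_event_id evtid;
	unsigned int evt_cfg;
	struct rdtgroup *rdtgrp;
};

#define NUM_MBM_CNTRS 4	/* models r->mon.num_mbm_cntrs */

struct rdt_mon_domain {
	struct mbm_cntr_cfg cntr_cfg[NUM_MBM_CNTRS];
	int hw_writes;	/* hypothetical: counts stubbed hardware updates */
};

/* Hypothetical placeholder for resctrl_config_cntr(). */
static void config_cntr_stub(struct rdt_mon_domain *d)
{
	d->hw_writes++;
}

/* Return the counter ID assigned to @rdtgrp, @evtid or -ENOENT. */
static int mbm_cntr_get(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
			enum resctrl_event_id evtid)
{
	int cntr_id;

	for (cntr_id = 0; cntr_id < NUM_MBM_CNTRS; cntr_id++) {
		if (d->cntr_cfg[cntr_id].rdtgrp == rdtgrp &&
		    d->cntr_cfg[cntr_id].evtid == evtid)
			return cntr_id;
	}

	return -ENOENT;
}

/* Claim the first free counter for @rdtgrp, @evtid or return -ENOSPC. */
static int mbm_cntr_alloc(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
			  enum resctrl_event_id evtid)
{
	int cntr_id;

	for (cntr_id = 0; cntr_id < NUM_MBM_CNTRS; cntr_id++) {
		if (!d->cntr_cfg[cntr_id].rdtgrp) {
			d->cntr_cfg[cntr_id].rdtgrp = rdtgrp;
			d->cntr_cfg[cntr_id].evtid = evtid;
			return cntr_id;
		}
	}

	return -ENOSPC;
}

/* Takes @mevt directly, so no event ID to mon_evt lookup can fail here. */
static int resctrl_alloc_config_cntr(struct rdt_mon_domain *d,
				     struct rdtgroup *rdtgrp,
				     struct mon_evt *mevt)
{
	int cntr_id;

	cntr_id = mbm_cntr_get(d, rdtgrp, mevt->evtid);
	if (cntr_id < 0) {
		cntr_id = mbm_cntr_alloc(d, rdtgrp, mevt->evtid);
		if (cntr_id < 0)
			return cntr_id;
	}

	/* Touch the (stubbed) hardware only when the configuration changed. */
	if (mevt->evt_cfg != d->cntr_cfg[cntr_id].evt_cfg) {
		d->cntr_cfg[cntr_id].evt_cfg = mevt->evt_cfg;
		config_cntr_stub(d);
	}

	return 0;
}
```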

> 
>> +		return -EINVAL;
>> +	}
> 
> 
> 
>> +
>> +	/*
>> +	 * Skip reconfiguration if the event setup is current; otherwise,
>> +	 * update and apply the new configuration to the domain.
>> +	 */
>> +	if (mevt->evt_cfg != d->cntr_cfg[cntr_id].evt_cfg) {
> 
> Lost me. Previous patch silently created mon_event::evt_cfg without initializing it.
> Here it is compared and treated as the "source of truth" ... where does its value
> come from?

Yes, that is correct. evt_cfg has to be initialized in the patch that
first introduces it. Will fix.


> 
>> +		d->cntr_cfg[cntr_id].evt_cfg = mevt->evt_cfg;
>> +		resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid, rdtgrp->closid,
>> +				    cntr_id, mevt->evt_cfg, true);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Assign a hardware counter to event @evtid of group @rdtgrp.
>> + * Assign counters to all domains if @d is NULL; otherwise, assign the
>> + * counter to the specified domain @d.
> 
> Can add here what is mentioned in changelog that this exits on first failure
> and so highlight that this can have partial assignment when exit on such failure.

Sure.
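
For reference, the partial-assignment behavior the comment needs to call out
can be modeled in user space as below. This is a sketch under assumptions:
domains are a plain array instead of the r->mon_domains list, the event ID is
dropped so each counter only records its owning group, and alloc_cntr() and
assign_cntr_event() are illustrative names rather than the kernel functions.

```c
#include <errno.h>
#include <stddef.h>

#define NUM_MBM_CNTRS 2	/* models r->mon.num_mbm_cntrs */

/* Simplified stand-ins for the kernel structures (assumption). */
struct rdtgroup {
	int id;
};

struct rdt_mon_domain {
	struct rdtgroup *owner[NUM_MBM_CNTRS];
};

/* Reuse the counter already owned by @rdtgrp, else claim a free one. */
static int alloc_cntr(struct rdt_mon_domain *d, struct rdtgroup *rdtgrp)
{
	int i;

	for (i = 0; i < NUM_MBM_CNTRS; i++) {
		if (d->owner[i] == rdtgrp)
			return i;
	}

	for (i = 0; i < NUM_MBM_CNTRS; i++) {
		if (!d->owner[i]) {
			d->owner[i] = rdtgrp;
			return i;
		}
	}

	return -ENOSPC;
}

/**
 * assign_cntr_event() - Assign a counter to @rdtgrp in one or all domains
 * @doms:	Array of all monitoring domains.
 * @ndoms:	Number of entries in @doms.
 * @d:		Domain to assign in, or NULL to assign in every domain.
 * @rdtgrp:	Resource group that receives the counter.
 *
 * Stops at the first domain that fails. Counters assigned in earlier
 * domains are kept, so a failure can leave a partial assignment.
 *
 * Return: 0 on success, negative errno of the first failure.
 */
static int assign_cntr_event(struct rdt_mon_domain *doms, size_t ndoms,
			     struct rdt_mon_domain *d, struct rdtgroup *rdtgrp)
{
	size_t i;
	int ret;

	if (d) {
		ret = alloc_cntr(d, rdtgrp);
		return ret < 0 ? ret : 0;
	}

	for (i = 0; i < ndoms; i++) {
		ret = alloc_cntr(&doms[i], rdtgrp);
		if (ret < 0)
			return ret;
	}

	return 0;
}
```

With three domains where the middle one has no free counter, the first domain
ends up assigned, the call returns -ENOSPC, and the third domain is never
visited.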

> 
>> + */
>> +int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
>> +			      struct rdtgroup *rdtgrp, enum resctrl_event_id evtid)
>> +{
>> +	int ret = 0;
>> +
>> +	if (!d) {
>> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +			ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
>> +			if (ret)
>> +				return ret;
>> +		}
>> +	} else {
>> +		ret = resctrl_alloc_config_cntr(r, d, rdtgrp, evtid);
>> +	}
>> +
>> +	return ret;
>> +}
> 
> Reinette
> 

-- 
Thanks
Babu Moger
