lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58d214b6-9bd0-42dc-a00d-cc80bccc8fdd@amd.com>
Date: Wed, 25 Jun 2025 20:31:49 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
 Babu Moger <babu.moger@....com>, corbet@....net, tony.luck@...el.com,
 Dave.Martin@....com, james.morse@....com, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, akpm@...ux-foundation.org,
 rostedt@...dmis.org, paulmck@...nel.org, thuth@...hat.com, ardb@...nel.org,
 gregkh@...uxfoundation.org, seanjc@...gle.com, thomas.lendacky@....com,
 pawan.kumar.gupta@...ux.intel.com, manali.shukla@....com,
 perry.yuan@....com, kai.huang@...el.com, peterz@...radead.org,
 xiaoyao.li@...el.com, kan.liang@...ux.intel.com, mario.limonciello@....com,
 xin3.li@...el.com, gautham.shenoy@....com, xin@...or.com,
 chang.seok.bae@...el.com, fenghuay@...dia.com, peternewman@...gle.com,
 maciej.wieczor-retman@...el.com, eranian@...gle.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v14 13/32] fs/resctrl: Introduce mbm_cntr_cfg to track
 assignable counters per domain

Hi Reinette,

On 6/24/2025 6:31 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/13/25 2:04 PM, Babu Moger wrote:
>> The "mbm_event" mode allows users to assign a hardware counter ID to an
> 
> "hardware counter ID" -> "hardware counter" (I'll stop pointing these out)

Sure.

> 
>> RMID, event pair and monitor bandwidth usage as long as it is assigned.
>> The hardware continues to track the assigned counter until it is
>> explicitly unassigned by the user. Counters are assigned/unassigned at
>> monitoring domain level.
>>
>> Manage a monitoring domain's hardware counters using a per monitoring
>> domain array of struct mbm_cntr_cfg that is indexed by the hardware
>> counter ID. A hardware counter's configuration contains the MBM event
>> ID and points to the monitoring group that it is assigned to, with a
>> NULL pointer meaning that the hardware counter is available for assignment.
>>
>> There is no direct way to determine which hardware counters are assigned
>> to a particular monitoring group. Check every entry of every hardware
>> counter configuration array in every monitoring domain to query which
>> MBM events of a monitoring group is tracked by hardware. Such queries are
>> acceptable because of a very small number of assignable counters (32
>> to 64).
>>
>> Suggested-by: Peter Newman <peternewman@...gle.com>
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
> ...
> 
>> ---
>>   fs/resctrl/rdtgroup.c   |  8 ++++++++
>>   include/linux/resctrl.h | 19 +++++++++++++++++++
>>   2 files changed, 27 insertions(+)
>>
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 967e4df62a19..90b52593ef29 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -4084,6 +4084,7 @@ static void rdtgroup_setup_default(void)
>>   
>>   static void domain_destroy_mon_state(struct rdt_mon_domain *d)
>>   {
>> +	kfree(d->cntr_cfg);
>>   	bitmap_free(d->rmid_busy_llc);
>>   	for (int i = 0; i < QOS_NUM_L3_MBM_EVENTS; i++) {
>>   		kfree(d->mbm_states[i]);
>> @@ -4167,6 +4168,13 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
>>   			goto cleanup;
>>   	}
>>   
>> +	if (resctrl_is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>> +		tsize = sizeof(*d->cntr_cfg);
>> +		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
>> +		if (!d->cntr_cfg)
>> +			goto cleanup;
>> +	}
>> +
> 
> Please see my earlier comment https://lore.kernel.org/lkml/b761e6ec-a874-4d06-8437-a3a717a91abb@intel.com/
> Before this addition the "cleanup" goto label can only be called when
> (a) idx is guaranteed to be initialized and (b) d->mbm_states[idx] == NULL.
> Using that goto label in snippet above cannot guarantee either.

Yes. Tony took care of this.

cleanup:
         bitmap_free(d->rmid_busy_llc);
         for_each_mbm_idx(idx) {
                 kfree(d->mbm_states[idx]);
                 d->mbm_states[idx] = NULL;
         }

         return -ENOMEM;
}

> 
>>   	return 0;
>>   cleanup:
>>   	bitmap_free(d->rmid_busy_llc);
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index f078ef24a8ad..468a4ebabc64 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -156,6 +156,22 @@ struct rdt_ctrl_domain {
>>   	u32				*mbps_val;
>>   };
>>   
>> +/**
>> + * struct mbm_cntr_cfg - Assignable counter configuration.
>> + * @evtid:		MBM event to which the counter is assigned. Only valid
>> + *			if @rdtgroup is not NULL.
>> + * @evt_cfg:		Event configuration created using the READS_TO_LOCAL_MEM,
>> + *			READS_TO_REMOTE_MEM, etc. bits that represent the memory
>> + *			transactions being counted.
>> + * @rdtgrp:		resctrl group assigned to the counter. NULL if the
>> + *			counter is free.
>> + */
>> +struct mbm_cntr_cfg {
>> +	enum resctrl_event_id	evtid;
>> +	u32			evt_cfg;
> 
> It is not clear to me why the event configuration needs to be duplicated
> between mbm_cntr_cfg::evt_cfg and mon_evt::evt_cfg (done in patch #16).
> I think there should be only one "source of truth" and mon_evt::evt_cfg
> seems most appropriate since then it can be shared with BMEC.
> 
> It also seems unnecessary to make so many copies of the event configuration
> if it can just be determined from the event ID.
> 
> Looking ahead at how this is used, for example in event_filter_write()
> introduced in patch #25:
> 	ret = resctrl_process_configs(buf, &evt_cfg);
> 	if (!ret && mevt->evt_cfg != evt_cfg) {
> 		mevt->evt_cfg = evt_cfg;
> 		resctrl_assign_cntr_allrdtgrp(r, mevt);
> 	}
> 
> After user provides new event configuration the mon_evt::evt_cfg is
> updated. Since there is this initial check to determine if counters need
> to be updated I think it is unnecessary to have a second copy of mbm_cntr_cfg::evt_cfg
> that needs to be checked again. The functions called by resctrl_assign_cntr_allrdtgrp(r, mevt)
> should just update the counters without any additional comparison.
> 
> For example, rdtgroup_assign_cntr() can be simplified to:
> 	rdtgroup_assign_cntr() {
> 		...
> 		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> 			cntr_id = mbm_cntr_get(r, d, rdtgrp, mevt->evtid);
> 			if (cntr_id >= 0)
> 				resctrl_arch_config_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid,
> 							 rdtgrp->closid, cntr_id, true);
> 		}
> 	}
> 
> 

Actually, this interaction works as intended.

It serves as an optimization for cases where the user repeatedly tries 
to assign the same event to a group. Since we have no way of knowing 
whether the event is up-to-date, this mechanism helps us avoid 
unnecessary MSR writes.

For example:
mbm_L3_assignments_write() → resctrl_assign_cntr_event() → 
resctrl_alloc_config_cntr() → resctrl_config_cntr() → 
resctrl_arch_config_cntr()


resctrl_alloc_config_cntr()

{
..

/*
  * Skip reconfiguration if the event setup is current; otherwise,
  * update and apply the new configuration to the domain.
  */
  if (mevt->evt_cfg != d->cntr_cfg[cntr_id].evt_cfg) {
      d->cntr_cfg[cntr_id].evt_cfg = mevt->evt_cfg;
      resctrl_config_cntr(r, d, mevt->evtid, rdtgrp->mon.rmid,
                                 rdtgrp->closid, cntr_id, true);
    }
}


Thanks
Babu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ