[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <44c82df0-30e7-44d1-8759-c8f3df68e538@amd.com>
Date: Mon, 14 Apr 2025 12:48:58 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, tony.luck@...el.com,
peternewman@...gle.com
Cc: corbet@....net, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
paulmck@...nel.org, akpm@...ux-foundation.org, thuth@...hat.com,
rostedt@...dmis.org, ardb@...nel.org, gregkh@...uxfoundation.org,
daniel.sneddon@...ux.intel.com, jpoimboe@...nel.org,
alexandre.chartre@...cle.com, pawan.kumar.gupta@...ux.intel.com,
thomas.lendacky@....com, perry.yuan@....com, seanjc@...gle.com,
kai.huang@...el.com, xiaoyao.li@...el.com, kan.liang@...ux.intel.com,
xin3.li@...el.com, ebiggers@...gle.com, xin@...or.com,
sohil.mehta@...el.com, andrew.cooper3@...rix.com, mario.limonciello@....com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
maciej.wieczor-retman@...el.com, eranian@...gle.com
Subject: Re: [PATCH v12 03/26] x86/cpufeatures: Add support for Assignable
Bandwidth Monitoring Counters (ABMC)
Hi Reinette,
On 4/11/25 15:52, Reinette Chatre wrote:
> Hi Babu,
>
> On 4/3/25 5:18 PM, Babu Moger wrote:
>> Users can create as many monitor groups as RMIDs supported by the hardware.
>> However, bandwidth monitoring feature on AMD system only guarantees that
>> RMIDs currently assigned to a processor will be tracked by hardware. The
>> counters of any other RMIDs which are no longer being tracked will be reset
>> to zero. The MBM event counters return "Unavailable" for the RMIDs that are
>> not tracked by hardware. So, there can be only limited number of groups
>> that can give guaranteed monitoring numbers. With ever changing
>> configurations there is no way to definitely know which of these groups are
>> being tracked for certain point of time. Users do not have the option to
>> monitor a group or set of groups for certain period of time without
>> worrying about RMID being reset in between.
>>
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as it is
>> assigned. The assigned RMID will be tracked by the hardware until the user
>> unassigns it manually. There is no need to worry about counters being reset
>> during this period. Additionally, the user can specify a bitmask
>> identifying the specific bandwidth types from the given source to track
>> with the counter.
>>
>> Without ABMC enabled, monitoring will work in current mode without
>> assignment option.
>>
>> Linux resctrl subsystem provides the interface to count maximum of two
>> memory bandwidth events per group, from a combination of available total
>> and local events. Keeping the current interface, users can enable a maximum
>> of 2 ABMC counters per group. User will also have the option to enable only
>> one counter to the group. If the system runs out of assignable ABMC
>> counters, kernel will display an error. Users need to disable an already
>> enabled counter to make space for new assignments.
>
> The above paragraph sounds like it is still talking about the original
> global assignment of counters.
Ok. Sure. Will update it.
>
>>
>> The feature can be detected via CPUID_Fn80000020_EBX_x00 bit 5.
>> Bits Description
>> 5 ABMC (Assignable Bandwidth Monitoring Counters)
>>
>> The feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@....com>
>> ---
>>
>> Note: Checkpatch checks/warnings are ignored to maintain coding style.
>>
>> v12: Removed the dependancy on X86_FEATURE_BMEC.
>
> Considering this removal it is not clear to me how the BMEC and ABMC features
> are managed on a platform. Since this dependency existed I assume platforms
> that support both ABMC and BMEC exist and after previous discussion [1]
> I expected to see that BMEC support will be disabled when ABMC is detected
> but I do not see this done in this series. From what I can tell, looking at
> patch "x86/resctrl: Detect Assignable Bandwidth Monitoring feature details"
> BMEC and ABMC are both detected and enabled while I do not see any
> interactions handled. For example, a user modifying the BMEC appears
> to have no impact on existing ABMC assigned counters. Could you please clarify
> how event configuration works on platforms that support both ABMC and BMEC?
They are mutually exclusive. If ABMC is enabled then BMEC should not work.
I missed to handle it. Also, I was not very clear at that on how to
handle that.
Here is my proposal to handle this case. This can be separate patch.
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d10cf1e5b914..772f2f77faee 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1370,7 +1370,7 @@ static int rdt_mon_features_show(struct
kernfs_open_file *of,
list_for_each_entry(mevt, &r->mon.evt_list, list) {
seq_printf(seq, "%s\n", mevt->name);
- if (mevt->configurable)
+ if (mevt->configurable &&
!resctrl_arch_mbm_cntr_assign_enabled(r))
seq_printf(seq, "%s_config\n", mevt->name);
}
@@ -1846,6 +1846,11 @@ static int mbm_config_show(struct seq_file *s,
struct rdt_resource *r, u32 evtid
cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
+ if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+ rdt_last_cmd_puts("Event configuration(BMEC) not supported
with mbm_cntr_assign mode\n");
+ return -EINVAL;
+ }
+
list_for_each_entry(dom, &r->mon_domains, hdr.list) {
if (sep)
seq_puts(s, ";");
@@ -1865,21 +1870,24 @@ static int mbm_config_show(struct seq_file *s,
struct rdt_resource *r, u32 evtid
static int mbm_total_bytes_config_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
+ int ret;
struct rdt_resource *r = of->kn->parent->priv;
- mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
+ ret = mbm_config_show(seq, r, QOS_L3_MBM_TOTAL_EVENT_ID);
- return 0;
+ return ret;
}
static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
+ int ret;
+
struct rdt_resource *r = of->kn->parent->priv;
- mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
+ ret = mbm_config_show(seq, r, QOS_L3_MBM_LOCAL_EVENT_ID);
- return 0;
+ return ret;
}
static void mbm_config_write_domain(struct rdt_resource *r,
@@ -1932,6 +1940,11 @@ static int mon_config_write(struct rdt_resource *r,
char *tok, u32 evtid)
/* Walking r->domains, ensure it can't race with cpuhp */
lockdep_assert_cpus_held();
+ if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+ rdt_last_cmd_puts("Event configuration(BMEC) not supported
with mbm_cntr_assign mode\n");
+ return -EINVAL;
+ }
+
next:
if (!tok || tok[0] == '\0')
return 0;
>
> Reinette
>
> [1] https://lore.kernel.org/all/4b66ea1c-2f76-4218-a67b-2232b2be6990@amd.com/
>
--
Thanks
Babu Moger
Powered by blists - more mailing lists