lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <226388e2-deb7-4a1a-a648-eb1e0cfb18e2@amd.com>
Date: Fri, 21 Feb 2025 13:44:19 -0600
From: "Moger, Babu" <bmoger@....com>
To: James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>,
 corbet@....net, reinette.chatre@...el.com, tglx@...utronix.de,
 mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
 tony.luck@...el.com, peternewman@...gle.com
Cc: fenghua.yu@...el.com, x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
 akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
 xiongwei.song@...driver.com, pawan.kumar.gupta@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, jpoimboe@...nel.org, perry.yuan@....com,
 sandipan.das@....com, kai.huang@...el.com, xiaoyao.li@...el.com,
 seanjc@...gle.com, xin3.li@...el.com, andrew.cooper3@...rix.com,
 ebiggers@...gle.com, mario.limonciello@....com, tan.shaopeng@...itsu.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 maciej.wieczor-retman@...el.com, eranian@...gle.com
Subject: Re: [PATCH v11 07/23] x86/resctrl: Introduce the interface to display
 monitor mode

Hi James,

On 2/21/2025 12:06 PM, James Morse wrote:
> Hi Babu,
> 
> On 22/01/2025 20:20, Babu Moger wrote:
>> Introduce the interface file "mbm_assign_mode" to list monitor modes
>> supported.
>>
>> The "mbm_cntr_assign" mode provides the option to assign a counter to
>> an RMID, event pair and monitor the bandwidth as long as it is assigned.
>>
>> On AMD systems "mbm_cntr_assign" is backed by the ABMC (Assignable
>> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
>>
>> The "default" mode is the existing monitoring mode that works without the
>> explicit counter assignment, instead relying on dynamic counter assignment
>> by hardware that may result in hardware not dedicating a counter resulting
>> in monitoring data reads returning "Unavailable".
>>
>> Provide an interface to display the monitor mode on the system.
>> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> [mbm_cntr_assign]
>> default
> 
>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>> index fb90f08e564e..b5defc5bce0e 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -257,6 +257,32 @@ with the following files:
>>   	    # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>   	    0=0x30;1=0x30;3=0x15;4=0x15
>>   
>> +"mbm_assign_mode":
>> +	Reports the list of monitoring modes supported. The enclosed brackets
>> +	indicate which mode is enabled.
>> +	::
>> +
>> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> +	  [mbm_cntr_assign]
>> +	  default
>> +
>> +	"mbm_cntr_assign":
>> +
>> +	In mbm_cntr_assign, monitoring event can only accumulate data while
>> +	it is backed by a hardware counter. The user-space is able to specify
>> +	which of the events in CTRL_MON or MON groups should have a counter
>> +	assigned using the "mbm_assign_control" file. The number of counters
>> +	available is described in the "num_mbm_cntrs" file. Changing the mode
>> +	may cause all counters on a resource to reset.
> 
>> +	"default":
>> +
>> +	In default mode, resctrl assumes there is a hardware counter for each
>> +	event within every CTRL_MON and MON group. On AMD platforms, it is
>> +	recommended to use mbm_cntr_assign mode if supported, because reading
>> +	"mbm_total_bytes" or "mbm_local_bytes" will report 'Unavailable' if
>> +	there is no counter associated with that event.
> 
> But if you read a value instead of "Unavailable", that doesn't mean the value is correct.
> For two reads that succeed, the counter may have been reset in the middle.
> 
> I'm suggesting something like:
> | it is recommended to use mbm_cntr_assign mode if supported, to avoid counters
> | being re-allocated by hardware. This can cause a misleading value to be read,
> | or if no counter is associated with that event "Unavailable".
> 
Looks good with slight modification.

On AMD platforms, it is recommended to use the mbm_cntr_assign mode, if 
supported, to prevent the hardware from resetting counters between 
reads. This can result in misleading values or display "Unavailable" if 
no counter is assigned to the event.


> 
>>   "max_threshold_occupancy":
>>   		Read/write file provides the largest value (in
>>   		bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index f91fe605766f..3880480a41d2 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -854,6 +854,30 @@ static int rdtgroup_rmid_show(struct kernfs_open_file *of,
>>   	return ret;
>>   }
>>   
>> +static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
>> +					struct seq_file *s, void *v)
>> +{
>> +	struct rdt_resource *r = of->kn->parent->priv;
>> +
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	if (r->mon.mbm_cntr_assignable) {
>> +		if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
>> +			seq_puts(s, "[mbm_cntr_assign]\n");
>> +			seq_puts(s, "default\n");
>> +		} else {
>> +			seq_puts(s, "mbm_cntr_assign\n");
>> +			seq_puts(s, "[default]\n");
>> +		}
> 
> What do you think to an architecture being able to opt-out of this flexibility?
> 
> If there aren't enough counters I can expose what the hardware has through this interface
> - but if user-space turns it off ... then what?
> 
> For MPAM this would need to be some best-effort software allocation strategy that I'd
> rather not write - its not a problem that can be solved, and any value that is reported is
> likely to be wrong. For ABMC platforms, existing stable kernels expose a value, so being
> able to preserve the existing behaviour makes sense. MPAM doesn't have this problem.
> 
> Something like this:
> ----------%<----------
> @@ -861,16 +861,21 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open
> _file *of,
>                                          struct seq_file *s, void *v)
>   {
>          struct rdt_resource *r = of->kn->parent->priv;
> +       bool enabled = resctrl_arch_mbm_cntr_assign_enabled(r);
> 
>          mutex_lock(&rdtgroup_mutex);
> 
>          if (r->mon.mbm_cntr_assignable) {
> -               if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
> +               if (enabled)
>                          seq_puts(s, "[mbm_cntr_assign]\n");
> -                       seq_puts(s, "default\n");
> -               } else {
> -                       seq_puts(s, "mbm_cntr_assign\n");
> +               else
>                          seq_puts(s, "[default]\n");
> +
> +               if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED) {
> +                       if (enabled)
> +                               seq_puts(s, "default\n");
> +                       else
> +                               seq_puts(s, "mbm_cntr_assign\n");
>                  }
>          } else {
>                  seq_puts(s, "[default]\n");
> ----------%<----------
> 
> x86 wouldn't define CONFIG_RESCTRL_ASSIGN_FIXED, arm64 would, meaning for MPAM the file
> would be either:
>   | [default]
> or
> | [mbm_cntr_assign]
> 

Looks good mostly.

resctrl_arch_mbm_cntr_assign_enabled(r) needs to be called with the lock.

Also, there is no reference of CONFIG_RESCTRL_ASSIGN_FIXED in this 
series. Hope that is not an issue.

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c 
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3880480a41d2..2907bb7bfa56 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -858,16 +858,22 @@ static int resctrl_mbm_assign_mode_show(struct 
kernfs_open_file *of,
                                         struct seq_file *s, void *v)
  {
         struct rdt_resource *r = of->kn->parent->priv;
+       bool enabled;

         mutex_lock(&rdtgroup_mutex);
+       enabled = resctrl_arch_mbm_cntr_assign_enabled(r);

         if (r->mon.mbm_cntr_assignable) {
-               if (resctrl_arch_mbm_cntr_assign_enabled(r)) {
+               if (enabled)
                         seq_puts(s, "[mbm_cntr_assign]\n");
-                       seq_puts(s, "default\n");
-               } else {
-                       seq_puts(s, "mbm_cntr_assign\n");
+               else
                         seq_puts(s, "[default]\n");
+
+               if (!IS_ENABLED(CONFIG_RESCTRL_ASSIGN_FIXED)) {
+                       if (enabled)
+                               seq_puts(s, "default\n");
+                       else
+                               seq_puts(s, "mbm_cntr_assign\n");
                 }
         } else {
                 seq_puts(s, "[default]\n");




> 
>> +	} else {
>> +		seq_puts(s, "[default]\n");
>> +	}
>> +
>> +	mutex_unlock(&rdtgroup_mutex);
>> +
>> +	return 0;
>> +}
> 
> 
> Thanks,
> 
> James
> 

Thanks
Babu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ