linux-kernel - Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f4845fee-3f91-4e78-a186-a7bdc58f7873@intel.com>
Date: Wed, 13 Nov 2024 14:25:53 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghua.yu@...el.com>, "Peter
 Newman" <peternewman@...gle.com>, Jonathan Corbet <corbet@....net>, Shuah
 Khan <skhan@...uxfoundation.org>, <x86@...nel.org>
CC: James Morse <james.morse@....com>, Jamie Iles <quic_jiles@...cinc.com>,
	Babu Moger <babu.moger@....com>, Randy Dunlap <rdunlap@...radead.org>,
	"Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>,
	<linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
	<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 3/7] x86/resctrl: Refactor mbm_update()

Hi Tony,

On 10/29/24 10:28 AM, Tony Luck wrote:
> Computing memory bandwidth for all enabled events resulted in
> identical code blocks for total and local bandwidth in mbm_update().
> 
> Refactor with a helper function to eliminate code duplication.
> 
> No functional change.
> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  arch/x86/kernel/cpu/resctrl/monitor.c | 69 ++++++++++-----------------
>  1 file changed, 24 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 3ef339e405c2..1b6cb3bbc008 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -829,62 +829,41 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
>  	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
>  }
>  
> -static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
> -		       u32 closid, u32 rmid)
> +static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
> +				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
>  {
>  	struct rmid_read rr = {0};
>  
>  	rr.r = r;
>  	rr.d = d;
> +	rr.evtid = evtid;
> +	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
> +	if (IS_ERR(rr.arch_mon_ctx)) {
> +		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
> +				    PTR_ERR(rr.arch_mon_ctx));
> +		return;
> +	}
> +
> +	__mon_event_count(closid, rmid, &rr);
> +
> +	if (is_mba_sc(NULL))
> +		mbm_bw_count(closid, rmid, &rr);
> +

As I am staring at this more there seems to be an existing issue here ... note how
__mon_event_count()'s return value is not checked before mbm_bw_count() is called.
This means that mbm_bw_count() may run with rr.val of 0 that results in wraparound
inside it resulting in some unexpected bandwidth numbers. Since a counter read can fail
with a "Unavailable"/"Error" from hardware it is not deterministic how frequently this
issue can be encountered.

Skipping mbm_bw_count() if rr.val is 0 is one option ... that would keep the bandwidth
measurement static at whatever was the last successful read and thus not cause dramatic
changes by the software controller ... setting bandwidth to 0 if rr.val is 0 is another
option to reflect that bandwidth data is unavailable, but then the software controller should
perhaps get signal to not make adjustments? I expect there are better options? What do
you think?

Reinette