lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aR9TBNAJqQNpGjMh@e133380.arm.com>
Date: Thu, 20 Nov 2025 17:42:28 +0000
From: Dave Martin <Dave.Martin@....com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: linux-kernel@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
	James Morse <james.morse@....com>, Ben Horgan <ben.horgan@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>,
	x86@...nel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH v2] x86,fs/resctrl: Factor MBA parse-time conversion to
 be per-arch

On Thu, Nov 20, 2025 at 08:46:24AM -0800, Reinette Chatre wrote:
> Hi Dave,
> 
> On 10/31/25 8:41 AM, Dave Martin wrote:
> > The control value parser for the MB resource currently coerces the
> > memory bandwidth percentage value from userspace to be an exact
> > multiple of the rdt_resource::resctrl_membw::bw_gran parameter.
> > 
> > On MPAM systems, this results in somewhat worse-than-worst-case
> > rounding, since the bandwidth granularity advertised to resctrl by the
> > MPAM driver is in general only an approximation to the actual hardware
> > granularity on these systems, and the hardware bandwidth allocation
> > control value is not natively a percentage -- necessitating a further
> > conversion in the resctrl_arch_update_domains() path, regardless of the
> > conversion done at parse time.
> > 
> > Allow the arch to provide its own parse-time conversion that is
> > appropriate for the hardware, and move the existing conversion to x86.
> > This will avoid accumulated error from rounding the value twice on MPAM
> > systems.
> > 
> > Clarify the documentation, but avoid overly exact promises.
> > 
> > Clamping to bw_min and bw_max still feels generic: leave it in the core
> > code, for now.
> > 
> > No functional change.
> 
> Please use max line length available. Changelogs have to conform before merge
> and having patch ready saves this work.

Fixed.  Sorry, old habits die hard.

I won't bother with the tearoff text, though, it that's OK -- that
won't go into git.

[...]

> > diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> > index b7f35b07876a..fbbcf5421346 100644
> > --- a/Documentation/filesystems/resctrl.rst
> > +++ b/Documentation/filesystems/resctrl.rst

[...]

> > @@ -737,8 +736,10 @@ The minimum bandwidth percentage value for each cpu model is predefined
> >  and can be looked up through "info/MB/min_bandwidth". The bandwidth
> >  granularity that is allocated is also dependent on the cpu model and can
> >  be looked up at "info/MB/bandwidth_gran". The available bandwidth
> > -control steps are: min_bw + N * bw_gran. Intermediate values are rounded
> > -to the next control step available on the hardware.
> > +control steps are, approximately, min_bw + N * bw_gran.  The steps may
> > +appear irregular due to rounding to an exact percentage: bw_gran is the
> > +maximum interval between the percentage values corresponding to any two
> > +adjacent steps in the hardware.
> >  
> >  The bandwidth throttling is a core specific mechanism on some of Intel
> >  SKUs. Using a high bandwidth and a low bandwidth setting on two threads
> 
> The documentation changes look good to me. Thank you.

OK, sounds good.

[...]

> > diff --git a/fs/resctrl/ctrlmondata.c b/fs/resctrl/ctrlmondata.c
> > index 0d0ef54fc4de..26e3f7c88c86 100644
> > --- a/fs/resctrl/ctrlmondata.c
> > +++ b/fs/resctrl/ctrlmondata.c
> > @@ -35,8 +35,8 @@ typedef int (ctrlval_parser_t)(struct rdt_parse_data *data,
> >  /*
> >   * Check whether MBA bandwidth percentage value is correct. The value is
> >   * checked against the minimum and max bandwidth values specified by the
> > - * hardware. The allocated bandwidth percentage is rounded to the next
> > - * control step available on the hardware.
> > + * hardware. The allocated bandwidth percentage is converted as
> > + * appropriate for consumption by the specific hardware driver.
> 
> The text looks good but adjusting the right margin mid-paragraph seems unnecessary?
> 

I was trying to follow the prevailing line length; but I guess the
lines existing lines were already shortened by flowing the text, so I
misjudged it.

Fixed locally.

Is there a view on how to flow new text? (not so applicable here)

Keeping the lines a bit short (say, 72 chars) means that minor edits
and typo fixes can be applied without extraneous diff noise most of the
time.  I find reviewing an entire re-flowed paragraph annoying when
there is really just a one-word change buried in the middle of it
somewhere.

Equally, I try to avoid reflowing text for minor edits unless
absolutely necessary; poking a few chars over 80 seems tolerable in
that situation, but I prefer it not to go too far...

> >   */
> >  static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
> >  {
> > @@ -69,7 +69,7 @@ static bool bw_validate(char *buf, u32 *data, struct rdt_resource *r)
> >  		return false;
> >  	}
> >  
> > -	*data = roundup(bw, (unsigned long)r->membw.bw_gran);
> > +	*data = resctrl_arch_preconvert_bw(bw, r);
> >  	return true;
> >  }
> >  
> > diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> > index a7d92718b653..1fb6e2118b61 100644
> > --- a/include/linux/resctrl.h
> > +++ b/include/linux/resctrl.h
> > @@ -485,6 +485,20 @@ bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
> >   */
> >  int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
> >  
> > +/*
> > + * Convert a bandwidth control value to the appropriate form for
> > + * consumption by the hardware driver for resource r.
> 
> When being as descriptive, please switch to proper kernel-doc instead. There are
> a couple of examples for reference in this header file.

This comment was pretty trivial to begin with, but grew.

> > + *
> > + * For example, it simplifies the x86 RDT implementation to round the
> > + * value to a suitable step here and then treat the resulting value as
> > + * opaque when programming the hardware MSRs later on.
> 
> This "For example" can be dropped.

Done.

> > + *
> > + * Architectures for which this pre-conversion hook is not useful
> > + * should supply an implementation of this function that just returns
> > + * val unmodified.
> > + */

I've fleshed this out a little, as follows:

--8<--

/**
 * resctrl_arch_preconvert_bw() - Convert a bandwidth control value to the
 *				  appropriate form for consumption by the
 *				  hardware driver for resource r.
 * @val:	Control value written to the schemata file by userspace.
 * @r:		Resource whose schema was written.
 *
 * Return:	The converted value.
 *
 * It simplifies the x86 RDT implementation to round the value to a suitable
 * step at parse time and then treat the resulting value as opaque when
 * programming the hardware MSRs later on.  In this situation, this hook is the
 * correct place to perform the conversion.
 *
 * Architectures for which this pre-conversion hook is not useful should supply
 * an implementation of this function that just returns @val unmodified.
 */
u32 resctrl_arch_preconvert_bw(u32 val, const struct rdt_resource *r);

-->8--

Does that look like enough?

Cheers
---Dave

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ