Message-ID: <aNp+7yjrs36/hSPS@e133380.arm.com>
Date: Mon, 29 Sep 2025 13:43:27 +0100
From: Dave Martin <Dave.Martin@....com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: linux-kernel@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
James Morse <james.morse@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>,
x86@...nel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be
per-arch
Hi Reinette,
On Thu, Sep 25, 2025 at 09:53:37PM +0100, Reinette Chatre wrote:
> Hi Dave,
>
> On 9/25/25 5:46 AM, Dave Martin wrote:
> > On Tue, Sep 23, 2025 at 10:27:40AM -0700, Reinette Chatre wrote:
> >> On 9/22/25 7:39 AM, Dave Martin wrote:
> >>> On Fri, Sep 12, 2025 at 03:19:04PM -0700, Reinette Chatre wrote:
> >>>> Hi Dave,
[...]
> >>>> Also please use upper case for acronym mba->MBA.
> >>>
> >>> Ack (the local custom in the MPAM code is to use "mba", but arguably,
> >>> the meaning is not quite the same -- I'll change it.)
> >>
> >> I am curious what the motivation is for the custom? Knowing this will help
> >> me to keep things consistent when the two worlds meet.
> >
> > I think this has just evolved over time. On the x86 side, MBA is a
> > specific architectural feature, but on the MPAM side the architecture
> > doesn't really have a name for the same thing. Memory bandwidth is a
> > concept, but a few different types of control are defined for it, with
> > different names.
> >
> > So, for the MPAM driver "mba" is more of a software concept than
> > something in a published spec: it's the glue that attaches to "MB"
> > resource as seen through resctrl.
> >
> > (This isn't official though; it's just the mental model that I have
> > formed.)
>
> I see. Thank you for the details. My mental model is simpler: write acronyms
> in upper case.
Generally, I agree, although I'm not sure whether that acronym belongs
in the MPAM-specific code.
For this patch, though, that's irrelevant. I've changed it to "MBA"
as requested.
[...]
> >> really sound as though the current interface works that great for MPAM. If I
> >> understand correctly this patch enables MPAM to use existing interface for
> >> its memory bandwidth allocations but doing so does not enable users to
> >> obtain benefit of hardware capabilities. For that users would want to use
> >> the new interface?
> >
> > In an ideal world, probably, yes.
> >
> > Since not all use cases will care about full precision, the MB resource
> > (approximated for MPAM) should be fine for a lot of people, but I
> > expect that sooner or later somebody will want more exact control.
>
> ack.
OK.
[...]
> >> Considering the two statements:
> >> - "The available steps are no larger than this value."
> >> - "this value ... is not smaller than the apparent size of any individual rounding step"
> >>
> >> The "not larger" and "not smaller" sounds like all these words just end up saying that
> >> this is the step size?
> >
> > They are intended to be the same statement: A <= B versus
> > B >= A respectively.
>
> This is what I understood from the words ... and that made me think that it
> can be simplified to A = B ... but no need to digress ... onto the alternatives below ...
Right...
[...]
> > Instead, maybe we can just say something like:
> >
> > | The available steps are spaced at roughly equal intervals between the
> > | value reported by info/MB/min_bandwidth and 100%, inclusive. Reading
> > | info/MB/bandwidth_gran gives the worst-case precision of these
> > | interval steps, in per cent.
> >
> > What do you think?
>
> I find "worst-case precision" a bit confusing, consider for example, what
> would "best-case precision" be? What do you think of "info/MB/bandwidth_gran gives
> the upper limit of these interval steps"? I believe this matches what you
> mentioned a couple of messages ago: "The available steps are no larger than this
> value."
Yes, that works. "Worst case" implies a value judgement that smaller
steps are "better" than large steps, since the goal is control.
But your wording, to the effect that this is the largest (apparent)
step size, conveys all the needed information.
> (and "per cent" -> "percent")
( Note: https://en.wiktionary.org/wiki/per_cent )
(Though either is acceptable, the fused word has a more informal feel
to it for me. Happy to change it -- though your rewording below gets
rid of it anyway. This word doesn't appear in resctrl.rst --
everything is "percentage" etc.)
>
> > If that's adequate, then the wording under the definition of
> > "bandwidth_gran" could be aligned with this.
>
> I think putting together a couple of your proposals and statements while making the
> text more accurate may work:
>
> "bandwidth_gran":
> The approximate granularity in which the memory bandwidth
> percentage is allocated. The allocated bandwidth percentage
> is rounded up to the next control step available on the
> hardware. The available hardware steps are no larger than
> this value.
That's better, thanks. I'm happy to pick this up and reword the text
in both places along these lines.
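
As a rough illustration of the "rounded up to the next control step"
wording, here is a minimal sketch in Python. It is a toy model, not the
resctrl implementation: the function name is hypothetical, and it assumes
the steps are spaced exactly bandwidth_gran apart starting from
min_bandwidth, with 100% always available as the top step.

```python
def round_up_to_step(requested, min_bandwidth, bandwidth_gran):
    """Round a requested MB percentage up to the next step (toy model).

    Assumption: steps sit at min_bandwidth, min_bandwidth + gran, ...,
    and 100 is always the last step. Real hardware may differ.
    """
    if not min_bandwidth <= requested <= 100:
        raise ValueError("requested percentage out of range")
    # Ceiling division: whole steps needed above the minimum.
    steps = -(-(requested - min_bandwidth) // bandwidth_gran)
    # Clamp, since the final step may be cut short by the 100% limit.
    return min(100, min_bandwidth + steps * bandwidth_gran)
```

In this model a request that already sits on a step is returned unchanged,
and the final step can be shorter than bandwidth_gran, which is consistent
with "no larger than this value".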
> I assume "available" is needed because, even though the steps are not larger
> than "bandwidth_gran", the steps may not be consistent across the "min_bandwidth"
> to 100% range?
Yes -- or, rather, the steps _look_ inconsistent because they are
rounded to exact percentages by the interface.
I don't think we expect the actual steps in the hardware to be
irregular.
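
A small sketch of why regular hardware steps can look irregular through
a whole-percentage interface. The 64-step control below is purely an
assumption for illustration (it is not taken from any spec), as is the
round-to-nearest conversion:

```python
HW_STEPS = 64  # hypothetical: hardware control in units of 1/64


def apparent_percentages(hw_steps=HW_STEPS):
    """Percentages the evenly spaced hardware steps would show as."""
    # Integer round-to-nearest of k * 100 / hw_steps for k = 1..hw_steps.
    return [(k * 100 + hw_steps // 2) // hw_steps
            for k in range(1, hw_steps + 1)]


def apparent_step_sizes(percentages):
    """Differences between successive reported percentages."""
    return [b - a for a, b in zip(percentages, percentages[1:])]


pcts = apparent_percentages()
sizes = apparent_step_sizes(pcts)
```

Here every hardware step is the same size (100/64 = 1.5625%), but the
reported steps alternate between 1 and 2. In this toy model the largest
apparent step is 2, which is the kind of upper limit that bandwidth_gran
is meant to describe.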
[...]
> Reinette
Cheers
---Dave