Message-ID: <e5928f9f-8c41-45d1-900f-6801eea775a0@intel.com>
Date: Thu, 20 Nov 2025 13:55:22 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Dave Martin <Dave.Martin@....com>
CC: <linux-kernel@...r.kernel.org>, Tony Luck <tony.luck@...el.com>, "James
Morse" <james.morse@....com>, Ben Horgan <ben.horgan@....com>, Thomas
Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav
Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter
Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>, <x86@...nel.org>,
<linux-doc@...r.kernel.org>
Subject: Re: [PATCH v2] x86,fs/resctrl: Factor MBA parse-time conversion to be
per-arch

Hi Dave,

On 11/20/25 9:04 AM, Dave Martin wrote:
> On Wed, Nov 19, 2025 at 07:05:10PM -0800, Reinette Chatre wrote:
>> On 11/18/25 7:47 AM, Dave Martin wrote:
>>> On Fri, Nov 14, 2025 at 02:17:53PM -0800, Reinette Chatre wrote:
>>>> On 10/31/25 8:41 AM, Dave Martin wrote:
>>>> With resctrl "coercing" the user provided value before providing it to the architecture
>>>> it controls these control steps to match what the documentation states above. If resctrl
>>>> instead provides the value directly to the architecture I see nothing preventing the
>>>> architecture from ignoring resctrl's "contract" with user space documented above and
>>>> using arbitrary control steps since it also controls resctrl_arch_get_config() that is
>>>> displayed directly to user space. What guarantee is there that resctrl_arch_get_config()
>>>> will display a value that is "approximately" min_bw + N * bw_gran? This seems like opening
>>>
>>> No guarantee, but there will only be one resctrl_arch_preconvert_bw()
>>> per arch. We'd expect the functions to be simple, so this doesn't feel
>>> like an excessive maintenance burden?
>>
>> Agree.
>>
>>>
>>> (All the resctrl_arch_foo() hooks have to do the right thing after all,
>>> otherwise nothing will work.)
>>>
>>>
>>> With this patch, resctrl_arch_preconvert_bw() and
>>> resctrl_arch_{update_domains,get_config}() between them must provide
>>> the correct behaviour.
>>
>> Right. This blurs the lines of responsibility. I interpret this as:
>> "resctrl fs makes promises to user space that resctrl fs *and* the architecture are
>> responsible to keep"
>
> Ack. It's a joint responsibility.
>
> Thinking about it, I don't think that the value returned by
> resctrl_arch_preconvert_bw() is used in any way except for storing it
> into rdt_ctrl_domain::staged_configs[] for later consumption by
> resctrl_arch_update_domains().
>
> If so, the meaning of this value is really arch-determined.
>
> It is convenient for the x86 implementation to convert this to hardware
> form at parse time, whereas, because of the way the MPAM code is
> structured, it suits the MPAM driver better to convert it later on.
>
> In the x86 case, it means that the arch code doesn't have to
> distinguish much between input values for different kinds of control:
> it's just a matter of writing precomputed values to MSRs. This keeps
> parts of the backend simple.
>
> In the MPAM case, it's more about encapsulating grungy arch-specific
> details in the driver. The common resctrl structures don't contain all
> the information needed. We might have to pull a lot of junk out of the
> private headers in order to do the conversion in the context of an inline
> function called by the schemata parser.
>
> I think that it's likely that either all the conversion work is done in
> resctrl_arch_preconvert_bw() (e.g., x86), or all the work is done in
> the resctrl_arch_update_domains() backend (e.g., MPAM). There's
> probably no good reason ever to split the work into two parts.
>
> Does that make sense?

Good point. Yes.
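
To make the two styles concrete, here is a minimal C sketch of what "all conversion at parse time" versus "all conversion in the update backend" could look like. This is not kernel code: every sketch_* name, the struct, the "max_bw minus value" encoding and the 16-bit fraction are assumptions made up for illustration. Only resctrl_arch_preconvert_bw() and resctrl_arch_update_domains() come from the discussion above, and the sketch does not claim to match their real signatures.

```c
#include <stdint.h>

/* Hypothetical stand-in for the bandwidth limits resctrl advertises. */
struct sketch_membw {
	uint32_t min_bw;	/* smallest accepted percentage */
	uint32_t max_bw;	/* largest accepted percentage */
	uint32_t bw_gran;	/* advertised control step */
};

/*
 * Snap a user value onto the min_bw + N * bw_gran grid, rounding up
 * and clamping to [min_bw, max_bw]. Assumed policy, for illustration.
 */
static inline uint32_t sketch_coerce_bw(const struct sketch_membw *m,
					uint32_t bw)
{
	uint32_t steps;

	if (bw <= m->min_bw)
		return m->min_bw;
	if (bw >= m->max_bw)
		return m->max_bw;

	steps = (bw - m->min_bw + m->bw_gran - 1) / m->bw_gran;
	bw = m->min_bw + steps * m->bw_gran;

	return bw > m->max_bw ? m->max_bw : bw;
}

/*
 * Style A: everything happens at parse time. The staged value is
 * already in (made-up) hardware form, so the update step only writes it.
 */
static inline uint32_t sketch_a_preconvert(const struct sketch_membw *m,
					   uint32_t bw)
{
	return m->max_bw - sketch_coerce_bw(m, bw);
}

static inline uint32_t sketch_a_update(uint32_t staged)
{
	return staged;
}

/*
 * Style B: the parse-time hook passes the user value through untouched
 * and the update backend converts it, here to a made-up 16-bit fraction.
 * The caller is assumed to have limited staged to 0..100.
 */
static inline uint32_t sketch_b_preconvert(uint32_t bw)
{
	return bw;
}

static inline uint32_t sketch_b_update(uint32_t staged)
{
	return staged * 0xffffu / 100u;
}
```

In both styles the staged value is opaque to the generic code. It is only carried from the parse-time hook to the update step, which is why its meaning can be left to the architecture.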
>
>>>
>>> But even today, resctrl_arch_update_domains() and
>>> resctrl_arch_get_config() have to do the right thing in order for
>>> bandwidth control values to be handled correctly, as seen through the
>>> schemata interface.
>>
>> ack.
>>
>>>
>>> (x86's job is easy, because the generic interface between the resctrl
>>> core and the arch interface happens to be expressed in terms of raw x86
>>> MSR values due to the history. But other arches don't get the benefit
>>> of that.)
>>
>> That is just the benefit of being the first architecture to be supported.
>> If determined to cause headaches elsewhere it can surely change.
>>
>>>
>>> The reason for this patch is the generic code can't do the right thing
>>> for MPAM, unless there is a hook into the arch code, arch-/hardware-
>>> specific knowledge is added in the core code, or unless a misleading
>>> bw_gran value is advertised.
>>
>> Understood.
>>
>>>
>>> I tried to take the pragmatic approach, but I'm open to suggestions if
>>> you'd prefer this to be factored in another way.
>>
>> I am ok with this approach and will respond to the patch details separately.
>
> OK, thanks -- I see you already replied, so I'll respond there.
>
>>>
>>>> the door even wider for resctrl to become architecture specific ... with this change the
>>>> schemata file becomes a direct channel between user space and the arch that risks users
>>>> needing to tread carefully when switching between different architectures.
>>>
>>> There doesn't feel like a magic solution here.
>>>
>>> Exact bandwidth and flow control behaviour is extremely dependent on
>>> hardware topology and the design of interconnects and on dynamic system
>>> load. An interface that is generic and rigorously defined, yet also
>>> simple enough to be reasonably usable by portable software would
>>> probably not be implementable on any real hardware platform.
>>>
>>> So, if it's useful to have a generic interface at all, hardware and
>>> software are going to have to meet in the middle somewhere...
>>
>> I believe that we could also use above as a quote in support of the schema
>> description work.
>>
>>>
>>> (The historical behaviour -- and even the interface -- of MB is not
>>> generic either, right?)
>>
>> Right. Even so, I prefer to use MB as motivation to do things better rather
>> than an excuse to make things worse.
>>
>> Reinette
>
> Cheap shot, I know.
>
> I guess that more is known now about what kinds of control behaviour
> are likely to exist than was known about when resctrl was first
> developed... though we still don't have a crystal ball.
>
> My aim is to generalise enough to cover most of what we know about today,
> while not inventing pie-in-the-sky abstractions that may never be
> fully used... It's a balancing act, though.
>
> There's a fine line between a "random nonportable junk" schema and
> reasonable exposure of an architecture-specific control that makes
> sense within resctrl but is unlikely to map onto other architectures.
>
> We should certainly push back on the former, but could it be appropriate
> to expose arch-specific control types in some situations? I don't like
> to rule it out absolutely.

It is difficult to fully reason about the merits without an example as a reference.
We'll have better information when faced with enabling such a control.

Some initial thoughts: First, while a control may start out unique to an arch,
we cannot predict that it will remain so. Second, with the new
schema descriptions it should be possible to easily add support for a
new control type. Of course, the first arch that supports the control type
has the benefit of establishing the needed parameters. It would be ideal if a new
control could be mapped so that it appears all architectures support it, but
I do not think resctrl should make that a requirement for supporting a new
control.

Reinette