Message-ID: <a9b49861-ff36-4550-ad29-a737eb5c1d63@intel.com>
Date: Tue, 28 Oct 2025 16:17:05 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Dave Martin <Dave.Martin@....com>, <linux-kernel@...r.kernel.org>
CC: Tony Luck <tony.luck@...el.com>, James Morse <james.morse@....com>, "Chen,
Yu C" <yu.c.chen@...el.com>, Thomas Gleixner <tglx@...utronix.de>, "Ingo
Molnar" <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, "H. Peter Anvin" <hpa@...or.com>, "Jonathan
Corbet" <corbet@....net>, <x86@...nel.org>
Subject: Re: [RFC] fs/resctrl: Generic schema description
Hi Dave,
On 10/24/25 4:12 AM, Dave Martin wrote:
> Hi all,
>
> Going forward, a single resctrl resource (such as memory bandwidth) is
> likely to require multiple schemata, either because we want to add new
> schemata that provide finer control, or because the hardware has
> multiple controls, covering different aspects of resource allocation.
>
> The fit between MPAM's memory bandwidth controls and the resctrl MB
> schema is already awkward, and later Intel RDT features such as Region
> Aware Memory Bandwidth Allocation are already pushing past what the MB
> schema can describe. Both of these can involve multiple control
> values and finer resolution than the 100 steps offered by the current
> "MB" schema.
>
> The previous discussion went off in a few different directions [1], so
> I want to focus back onto defining an extended schema description that
> aims to cover the use cases that we know about or anticipate today, and
> allows for future extension as needed.
>
> (A separate discussion is needed on how new schemata interact with
> previously-defined schemata (such as the MB percentage schema). I
> suggest we pause that discussion for now, in the interests of getting
> the schema description nailed down.)
ok, but let's keep this as "open #1"
> Following on from the previous mail thread, I've tried to refine and
> flesh out the proposal for schema descriptions a bit, as follows.
>
> Proposal:
>
> * Split resource names and schema names in resctrlfs.
>
> Resources will be named for the unique, existing schema for each
> resource.
Are you referring to the implementation or how things are exposed to user
space? I am trying to understand how the existing L3CODE/L3DATA schemata
fit in ... they are presented to user space as two separate resources since
they each have their own directory in "info" while internally they are
schemata of the L3 resource.

Just trying to understand if you are talking about reverting
https://lore.kernel.org/all/20210728170637.25610-1-james.morse@arm.com/ ?

The current implementation appears to match this proposal so we may need to
have special cases to keep CDP backwards compatible.

SMBA may also need some extra care ... especially if other architectures start
to allocate memory bandwidth to CXL resource via their "MB" resource.
> The existing schema will keep its name (the same as the resource
> name), and new schemata defined for a resource will include that
> name as a prefix (at least, by default).
>
> So, for example, we will have an MB resource with a schema called
> MB (the schema that we have already). But we may go on to define
> additional schemata for the MB resource, with names such as MB_MAX,
> etc.
>
> * Stop adding new schema description information in the top-level
> info/<resource>/ directory in resctrlfs.
>
> For backwards compatibility, we can keep the existing property
> files under the resource info directory to describe the previously
> defined resource, but we seem to need something richer going
> forward.
>
> * Add a hierarchy to list all the schemata for each resource, along
> with their properties. So far, the proposal looks like this,
> taking the MB resource as an example:
>
> info/
> └─ MB/
>    └─ resource_schemata/
>       ├─ MB/
>       ├─ MB_MIN/
>       ├─ MB_MAX/
>       ┆
>
> Here, MB, MB_MIN and MB_MAX are all schemata for the "MB" resource.
> In this proposal, these are just dummy schema names for
> illustration purposes. The important thing is that they all
> control aspects of the "MB" resource, and that there can be more
> than one of them.
>
> It may be appropriate to have a nested hierarchy, where some
> schemata are presented as children of other schemata if they
> affect the same hardware controls. For now, let's put this issue
> on one side, and consider what properties should be advertised for
> each schema.
ok to put this aside but I think we should keep including it, "open #2" ?
>
> * Current properties that I think we might want are:
>
> info/
> └─ SOME_RESOURCE/
>    └─ resource_schemata/
>       ├─ SOME_SCHEMA/
>       ┆  ├─ type
>          ├─ min
>          ├─ max
>          ├─ tolerance
>          ├─ resolution
>          ├─ scale
>          └─ unit
>
> (I've tweaked the properties a bit since previous postings.
> "type" replaces "map"; "scale" is now the unit multiplier;
> "resolution" is now a scaling divisor -- details below.)
>
> I assume that we expose the properties in individual files, but we
> could also combine them into a single description file per schema,
> per resource or (possibly) a single global file.
> (I don't have a strong view on the best option.)
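To make the one-file-per-property option concrete, here is a rough user-space sketch (Python, only to keep it short) that collects whatever property files exist under info/<resource>/resource_schemata/. The function name read_schema_properties and the PROPERTY_FILES tuple are made up for this example, and the directory layout is only the one proposed above, not an existing resctrl interface.

```python
from pathlib import Path

# Property file names taken from the proposal; nothing here exists in
# resctrl today, so treat the whole layout as hypothetical.
PROPERTY_FILES = ("type", "min", "max", "tolerance", "resolution", "scale", "unit")


def read_schema_properties(info_root, resource):
    """Return {schema name: {property name: raw file contents}} for one resource."""
    base = Path(info_root) / resource / "resource_schemata"
    schemata = {}
    for schema_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        props = {}
        for name in PROPERTY_FILES:
            prop_file = schema_dir / name
            # Missing files are skipped rather than treated as errors, so a
            # schema may expose only a subset of the properties.
            if prop_file.is_file():
                props[name] = prop_file.read_text().strip()
        schemata[schema_dir.name] = props
    return schemata
```

With individual files, a tool only has to know the file names it cares about and can ignore properties added later; a combined description file would need an agreed format that stays parseable as properties are added.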
>
>
> Either way, the following set of properties may be a reasonable
> place to start:
>
>
> type: the schema type, followed by optional flag specifiers:
>
> - "scalar": a single-valued numeric control
>
> A mandatory flag indicates how the control value written to
> the schemata file is converted to an amount of resource for
> hardware regulation.
>
> The flag "linear" indicates a linear mapping.
>
> In this case, the amount of resource E that is actually
> allocated is derived from the control value C written to the
> schemata file as follows:
>
> E = C * scale * unit / resolution
>
> Other flag values could be defined later, if we encounter
> hardware with non-linear controls.
>
> - "bitmap": a bitmap control
>
> The optional flag "sparse" is present if the control accepts
> sparse bitmaps.
>
> In this case, E = bitmap_weight(C) * scale * unit / resolution.
>
> As before, each bit controls access to a specific chunk of
> resource in the hardware, such as a group of cache lines. All
> chunks are equally sized.
>
> (Different CTRL_MON groups may still contend within the
> allocation E, when they have bits in common between their
> bitmaps.)
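For reference, the two conversions above can be written out as a small sketch (Python, for brevity). The result is expressed in multiples of "unit", since the meaning of the unit property is not spelled out in this part of the proposal. The function names and the sample values in the usage note are assumptions for illustration only.

```python
from fractions import Fraction


def bitmap_weight(mask):
    """Count the set bits in a non-negative bitmap value."""
    if mask < 0:
        raise ValueError("bitmap must be non-negative")
    return bin(mask).count("1")


def effective_amount(schema_type, control, scale, resolution):
    """Return E / unit for the 'scalar linear' and 'bitmap' cases.

    scalar: E = C * scale * unit / resolution
    bitmap: E = bitmap_weight(C) * scale * unit / resolution
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    if schema_type == "scalar":
        quantity = control
    elif schema_type == "bitmap":
        quantity = bitmap_weight(control)
    else:
        raise ValueError(f"unknown schema type: {schema_type}")
    # Fraction keeps the result exact instead of rounding early.
    return Fraction(quantity * scale, resolution)
```

As an example with made-up property values, a scalar control value of 50 with scale 1 and resolution 100 gives half a unit, and a bitmap of 0xff0 with scale 1 and resolution 16 also gives half a unit (8 bits set out of 16).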
Would it not be simpler to have the files/properties depend on the
schema type? It almost seems as though some of the properties are forced
to have some meaning for bitmap when they do not seem to be needed. Instead,
for a bitmap type there can be bitmap specific properties like, for example,
bit_usage. This may also create more flexibility when there is a future
mapping function needed that depends on some new property?
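A minimal sketch of what I mean, in Python only for brevity. Which properties end up belonging to which type is an assumption made up for this example (only bit_usage is mentioned above), and the names PROPERTIES_BY_TYPE and expected_properties are hypothetical.

```python
# Hypothetical split of property files by schema type. The exact sets are
# an assumption for illustration, not a settled proposal.
PROPERTIES_BY_TYPE = {
    "scalar": ("min", "max", "tolerance", "resolution", "scale", "unit"),
    "bitmap": ("bit_usage",),
}


def expected_properties(schema_type):
    """Return the property files a schema of this type would expose."""
    try:
        specific = PROPERTIES_BY_TYPE[schema_type]
    except KeyError:
        raise ValueError(f"unknown schema type: {schema_type}") from None
    # "type" is always present so user space knows which set to expect.
    return ("type",) + specific
```

User space would read "type" first and then look only for the files that make sense for that type, so a future mapping function could bring its own properties without forcing a meaning onto the existing ones.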
Reinette