Message-ID: <02647276-dea2-47b5-a271-7f02666e0492@intel.com>
Date: Mon, 29 Sep 2025 17:19:32 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>
CC: <linux-kernel@...r.kernel.org>, Reinette Chatre
<reinette.chatre@...el.com>, James Morse <james.morse@....com>, "Thomas
Gleixner" <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, "Borislav
Petkov" <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>, "H. Peter
Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>, <x86@...nel.org>,
<linux-doc@...r.kernel.org>, Dave Martin <Dave.Martin@....com>
Subject: Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be
per-arch
On 9/26/2025 6:58 AM, Luck, Tony wrote:
> On Mon, Sep 22, 2025 at 04:04:40PM +0100, Dave Martin wrote:
>> Hi again,
>>
>> On Fri, Sep 12, 2025 at 03:19:04PM -0700, Reinette Chatre wrote:
>>
>> [...]
>>
[snip]
>> For example, userspace might write the following:
>>
>> MB_MIN: 0=16, 1=16
>> MB_MAX: 0=32, 1=32
>>
>> Which might then read back as follows:
>>
>> MB: 0=50, 1=50
>> # MB_HW: 0=32, 1=32
>> # MB_MIN: 0=16, 1=16
>> # MB_MAX: 0=32, 1=32
>>
>>
>> I haven't tried to develop this idea further, for now.
>>
>> I'd be interested in people's thoughts on it, though.
>
> Applying this to Intel upcoming region aware memory bandwidth
> that supports 255 steps and h/w min/max limits.
> We would have info files with "min = 1, max = 255" and a schemata
> file that looks like this to legacy apps:
>
> MB: 0=50;1=75
> #MB_HW: 0=128;1=191
> #MB_MIN: 0=128;1=191
> #MB_MAX: 0=128;1=191
>
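To illustrate how a legacy reader would see such a file, here is a minimal Python sketch. It assumes that lines beginning with '#' are skipped as comments by legacy tools and that domains are separated by ';' as in the example above. The name parse_schemata is hypothetical and is not an existing resctrl interface.

```python
def parse_schemata(text, include_commented=False):
    """Parse schemata text into {resource: {domain_id: value}}.

    Assumption: lines starting with '#' carry the extended controls and
    are ignored by legacy readers unless include_commented is set.
    """
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            if not include_commented:
                continue  # legacy behaviour: treat the line as a comment
            line = line[1:].strip()
        name, _, body = line.partition(":")
        domains = {}
        for entry in body.split(";"):
            dom, _, val = entry.partition("=")
            domains[int(dom)] = int(val)
        result[name.strip()] = domains
    return result


SCHEMATA = """\
MB: 0=50;1=75
#MB_HW: 0=128;1=191
#MB_MIN: 0=128;1=191
#MB_MAX: 0=128;1=191
"""

legacy_view = parse_schemata(SCHEMATA)
extended_view = parse_schemata(SCHEMATA, include_commented=True)
```

Here legacy_view contains only the MB entry, while extended_view also exposes MB_HW, MB_MIN and MB_MAX.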
> But a newer app that is aware of the extensions can write:
>
> # cat > schemata << 'EOF'
> MB_HW: 0=10
> MB_MIN: 0=10
> MB_MAX: 0=64
> EOF
>
> which then reads back as:
> MB: 0=4;1=75
> #MB_HW: 0=10;1=191
> #MB_MIN: 0=10;1=191
> #MB_MAX: 0=64;1=191
>
> with the legacy line updated with the rounded value of the MB_HW
> supplied by the user. 10/255 = 3.921% ... so call it "4".
>
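The conversion between the hardware step value and the legacy percentage can be sketched as below. This is only an illustration: the function names are hypothetical, and round-half-up is an assumption, since the thread only says to "call it 4" without fixing a rounding mode.

```python
def hw_to_percent(hw_value, hw_max=255):
    """Convert a hardware step value to a whole percentage.

    Assumption: round half up, using integer arithmetic only.
    """
    return (hw_value * 100 + hw_max // 2) // hw_max


def percent_to_hw(percent, hw_max=255):
    """Convert a whole percentage to the nearest hardware step value."""
    return (percent * hw_max + 50) // 100


# Values taken from the examples quoted above.
legacy_from_hw = [hw_to_percent(v) for v in (10, 128, 191)]
hw_from_legacy = [percent_to_hw(p) for p in (50, 75)]
```

hw_max is a parameter rather than a constant, so the same conversion holds if a platform reports a different number of steps.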
This seems workable, as it introduces the new interface while
preserving backward compatibility with legacy apps.
One minor question: according to "Figure 6-5. MBA Optimal
Bandwidth Register" in the latest RDT specification [1], the maximum
value ranges from 1 to 511.
Additionally, this bandwidth field is located at bits 48 to 56 of
the MBA Optimal Bandwidth Register, and the range for
this segment could be 1 to 8191. I just wonder whether the current
maximum value of 511 might be extended in the future. Perhaps we
could explore a way to query the upper limit from the ACPI table or
a register, or use CPUID to distinguish between platforms, rather
than hardcoding it. Reinette also mentioned this in another thread.
Thanks,
Chenyu
[1] https://www.intel.com/content/www/us/en/content-details/851356/intel-resource-director-technology-intel-rdt-architecture-specification.html
> The region aware h/w supports separate bandwidth controls for each
> region. We could hope (or perhaps update the spec to define) that
> region0 is always node-local DDR memory and keep the "MB" tag for
> that.
>
> Then use some other tag naming for other regions. Remote DDR,
> local CXL, remote CXL are the ones we think are next in the h/w
> memory sequence. But the "region" concept would allow for other
> options as other memory technologies come into use.
>
>>
>> Cheers
>> ---Dave
>