Message-ID: <4ffceb95-0e5e-42d6-bbc3-a2cb8a3fd65a@intel.com>
Date: Tue, 30 Sep 2025 19:02:05 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: "Luck, Tony" <tony.luck@...el.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Chatre,
Reinette" <reinette.chatre@...el.com>, James Morse <james.morse@....com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Jonathan Corbet <corbet@....net>,
"x86@...nel.org" <x86@...nel.org>, "linux-doc@...r.kernel.org"
<linux-doc@...r.kernel.org>, Dave Martin <Dave.Martin@....com>
Subject: Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be
per-arch
On 9/30/2025 12:23 AM, Luck, Tony wrote:
>>> This seems to be applicable as it introduces the new interface
>>> while preserving forward compatibility.
>>>
>>> One minor question is that, according to "Figure 6-5. MBA Optimal
>>> Bandwidth Register" in the latest RDT specification, the maximum
>>> value ranges from 1 to 511.
>>> Additionally, this bandwidth field is located at bits 48 to 56 in
>>> the MBA Optimal Bandwidth Register, and the range for
>>> this segment could be 1 to 8191. Just wonder if it would be
>
> 48..56 is still 9 bits, so max value is 511.
>
Ah I see, I overlooked this.
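For reference, the arithmetic behind that correction can be checked with a
small sketch. The helper name field_max is hypothetical and only illustrates
how an inclusive bit range maps to a maximum value; it is not taken from the
RDT specification or from resctrl.

```python
def field_max(lo_bit: int, hi_bit: int) -> int:
    """Largest unsigned value that fits in the inclusive bit range [lo_bit, hi_bit]."""
    if hi_bit < lo_bit:
        raise ValueError("hi_bit must be >= lo_bit")
    width = hi_bit - lo_bit + 1   # bits 48..56 inclusive -> 9 bits
    return (1 << width) - 1       # all bits of the field set


if __name__ == "__main__":
    print(field_max(48, 56))
```

A maximum of 8191 would need a 13-bit field, which bits 48 to 56 do not
provide.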
>>> possible that the current maximum value of 512 may be extended
>>> in the future? Perhaps we could explore a method to query the maximum upper
>>> limit from the ACPI table or register, or use CPUID to distinguish between
>>> platforms rather than hardcoding it. Reinette also mentioned this in another
>>> thread.
>
> I think 511 was chosen as "bigger than we expect to ever need" and 9-bits
> allocated in the registers based on that.
>
OK, got it.
> Initial implementation may use 255 as the maximum - though I'm pushing on
> that a bit as the throttle graph at the early stage is fairly linear from "1"
> to some value < 255, when bandwidth hits maximum, then flat up to 255.
> If things stay that way, I'm arguing that the "Q" value enumerated in the ACPI
> table should be the value where peak bandwidth is hit
I see. If I understand correctly, the BIOS needs to pre-train the system to
find this Q. However, if the BIOS cannot provide this Q, would it be feasible
for the user to provide it? For example, the user could saturate the memory
bandwidth, gradually increase MB_MAX, and finally find the Q_max where the
memory bandwidth no longer increases. The user could then adjust the max
field in the info file.
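A minimal sketch of that user-side search, assuming the user has already
collected (throttle value, measured bandwidth) pairs while running a
bandwidth-saturating workload. The function name find_q_max, the 2%
tolerance, and the sample numbers are all hypothetical; nothing here reads
or writes a real resctrl file.

```python
def find_q_max(samples, tolerance=0.02):
    """Return the smallest throttle value whose bandwidth is within
    `tolerance` (as a fraction) of the peak measured bandwidth.

    samples: iterable of (q, bandwidth) pairs, in any order.
    """
    ordered = sorted(samples)            # ascending by throttle value q
    if not ordered:
        raise ValueError("no samples provided")
    peak = max(bw for _, bw in ordered)
    threshold = peak * (1.0 - tolerance)
    for q, bw in ordered:
        if bw >= threshold:              # first point on the flat part of the graph
            return q


if __name__ == "__main__":
    # Hypothetical measurements: linear up to 128, then flat.
    measured = [(32, 10.0), (64, 20.0), (96, 30.0),
                (128, 40.0), (160, 40.0), (255, 40.0)]
    print(find_q_max(measured))
```

The tolerance is there because real measurements are noisy, so the flat part
of the graph will not be exactly flat.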
> (though this is complicated
> because workloads with different mixes of read/write access have different
> throttle graphs).
>
Does this mean read and write operations have different Q values to saturate
the memory bandwidth? For example, if the workload is all reads, there is a
Q_r; if the workload is all writes, there is another Q_w. In that case, maybe
we could choose the maximum of Q_r and Q_w (max(Q_r, Q_w)).
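To make the max(Q_r, Q_w) idea concrete, here is a rough sketch that
generalizes it to any number of workload mixes. The function name
conservative_q and the numbers are hypothetical and are not measured on any
platform.

```python
def conservative_q(saturation_points):
    """Pick the largest per-mix saturation value, so that no workload mix
    is capped below the point where its bandwidth peaks.

    saturation_points: dict mapping a workload-mix label to the Q value
    at which that mix reaches peak bandwidth.
    """
    if not saturation_points:
        raise ValueError("need at least one workload mix")
    return max(saturation_points.values())


if __name__ == "__main__":
    # Hypothetical values for Q_r (all reads) and Q_w (all writes).
    print(conservative_q({"all_reads": 180, "all_writes": 140}))
```

Taking the maximum means mixes that saturate earlier simply see a flat
region between their own saturation point and the chosen value.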
thanks,
Chenyu