lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6941deab-0bec-41e1-975b-274fc9d6b8e1@intel.com>
Date: Mon, 16 Sep 2024 15:33:27 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: James Morse <james.morse@....com>, <x86@...nel.org>,
	<linux-kernel@...r.kernel.org>
CC: Fenghua Yu <fenghua.yu@...el.com>, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, H Peter Anvin
	<hpa@...or.com>, Babu Moger <Babu.Moger@....com>,
	<shameerali.kolothum.thodi@...wei.com>, D Scott Phillips OS
	<scott@...amperecomputing.com>, <carl@...amperecomputing.com>,
	<lcherian@...vell.com>, <bobo.shaobowang@...wei.com>,
	<tan.shaopeng@...itsu.com>, <baolin.wang@...ux.alibaba.com>, Jamie Iles
	<quic_jiles@...cinc.com>, Xin Hao <xhao@...ux.alibaba.com>,
	<peternewman@...gle.com>, <dfustini@...libre.com>, <amitsinght@...vell.com>,
	David Hildenbrand <david@...hat.com>, Rex Nie <rex.nie@...uarmicro.com>,
	"Dave Martin" <dave.martin@....com>
Subject: Re: [PATCH v4 04/39] x86/resctrl: Use schema type to determine how to
 parse schema values

Hi James,

On 9/6/24 11:06 AM, James Morse wrote:
> Hi Reinette,
> 
> On 14/08/2024 04:56, Reinette Chatre wrote:
>> On 8/2/24 10:28 AM, James Morse wrote:
>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>> index 9ca542a8e2d4..57c88e1c2adf 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> 
>>> @@ -192,6 +191,18 @@ enum resctrl_scope {
>>>        RESCTRL_L3_NODE,
>>>    };
>>>    +/**
>>> + * enum resctrl_schema_fmt - The format user-space provides for a schema.
>>> + * @RESCTRL_SCHEMA_BITMAP:    The schema is a bitmap in hex.
>>> + * @RESCTRL_SCHEMA_PERCENTAGE:    The schema is a decimal percentage value.
>>> + * @RESCTRL_SCHEMA_MBPS:    The schema is a decimal MBps value.
>>> + */
>>> +enum resctrl_schema_fmt {
>>> +    RESCTRL_SCHEMA_BITMAP,
>>> +    RESCTRL_SCHEMA_PERCENTAGE,
>>> +    RESCTRL_SCHEMA_MBPS,
>>> +};
>>> +
>>
>> I believe that the choice of RESCTRL_SCHEMA_PERCENTAGE and RESCTRL_SCHEMA_MBPS has
>> potential for significant confusion. The closest place to where user space can enter
>> a MBps value (which is actually MiBps) is on Intel when resctrl is mounted with mba_MBps,
>> and as per above it will have the "RESCTRL_SCHEMA_PERCENTAGE" format.
> 
> It was AMD's QOS that I was thinking of. To pick the example from the documentation:
> | For example, to allocate 2GB/s limit on the first cache id:
> |    MB:0=2048;1=2048;2=2048;3=2048
> 
> How does user-space know that its not a percentage? Suck it and see...
> If those values aren't in MB/s ... I'm not sure what they are.

I agree that this is an issue.

> 
> 
> The schema format isn't exposed to user-space. (I think we should do that).
> I agree if it were, we'd need to report MBps/MiBps for the mba_sc case.

I agree that it would be ideal to expose the schema format to user space.

>> What is considered
>> here as RESCTRL_SCHEMA_MBPS also cannot really be considered as "MBPS" since it is used to
>> cover AMD's values that are "multiples of one eighth GB/s". Any new resource that
>> _actually_ uses MBPS will thus not be able to use RESCTRL_SCHEMA_MBPS.
> 
> Isn't this already covered by the granularity?
> AMD platforms are unfortunately reporting '1' as the granularity.
>  From the 1/8th GB/s it sounds like the granularity should really be 128 MB/s.

Unfortunately the existing interface makes consistent use of granularity (by user
space) difficult.

As the doc highlights:
	"bandwidth_gran":
		...
		available bandwidth control steps are
		min_bandwidth + N * bandwidth_gran.


You are correct that setting AMD's granularity to 128 MB/s would make the expected
bandwidth clear to user space. The problem is that the existing schemata displays to
user space and expects from user space the value of "N" for MB on AMD while it
displays to user space and expects from user space the value of "min_bandwidth + N *
bandwidth_gran" from user space for MB on Intel.

> 
> Any platform that did have a real MB/s control could report whatever granularity it really
> has here.
> 
> 
>> Considering that RESCTRL_SCHEMA_PERCENTAGE and RESCTRL_SCHEMA_MBPS use the same parser,
>> could "RESCTRL_SCHEMA_RANGE" be more fitting? I acknowledge that it is very generic and
>> better
>> ideas are welcome. A "range" does seem to be appropriate considering the later patch
>> (x86/resctrl:
> 
>> Add max_bw to struct resctrl_membw) that codes an explicit max.
> 
> Yes because the AMD platforms have a firmware-advertised maximum value, I presume that is
> an achievable bandwidth. (otherwise why advertise it to user-space!).
> (I'm not sure why data_width is 4 - it looks like that was the hard coded based on one
>   platforms limit ... that probably needs fixing too)
> 
> But if you had a MBps control, I'm not sure that implies you know what the maximum
> achievable bandwidth is. The control may have a range based on some power of two, which is
> unrelated to the bandwidth.

I believe that is the current resctrl mechanism, with the huge caveat that the only
implementation is actually software defined range limit.

> In the arm world the hardware never knows what the achievable bandwidth is - that is
> something only the SoC designer can work out.
> 
> 
> I agree we can parse both values as a range, as resctrl doesn't need to interpret the
> values, its just whatever the hardware does.
> 
> I did it like this as I'd like to expose "percentage/MBps/bitmap" to user-space so that
> control types user-space doesn't recognised can be programmed. If arm64 and riscv each
> want to add new schema types, it would be good to do it in a way that is low-impact for
> the resctrl filesystem code, and unaware user-space can do something with.
> percentage/MBps/bitmap are the control schemes we already have.

Adding bitmap will need more thought, since resctrl does not support that today for
MB there is some flexibility when adding support for it.

> 
> The MBps/range difference matters if the 'maximum bandwidth' isn't actually an achievable
> number e.g. ~0, then user-space can't treat it as some kind of percentage.

This last statement is not clear to me. It sounds as though you are considering the
underlying implementation treating provided bandwidth as percentage as opposed to
the interface itself being bandwidth or percentage? hmmm ... or are you referring to
the software controller?

I wonder if bandwidth and percentage can continue using the same interface as today
by using bandwidth_gran as you hinted to ... but with the backward compatibility of
needing to distinguish between schemata interface expecting the total or the steps.
I do not know what (if any) impact there will be if AMD's bandwidth_gran is
changed to 128 MB/s.

Reinette

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ