lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0c2beb97-437c-4315-9aff-14c650297565@intel.com>
Date:   Mon, 4 Dec 2023 14:15:06 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Tony Luck <tony.luck@...el.com>
CC:     "babu.moger@....com" <babu.moger@....com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        Peter Newman <peternewman@...gle.com>,
        "Jonathan Corbet" <corbet@....net>,
        Shuah Khan <skhan@...uxfoundation.org>,
        "x86@...nel.org" <x86@...nel.org>,
        Shaopeng Tan <tan.shaopeng@...itsu.com>,
        James Morse <james.morse@....com>,
        Jamie Iles <quic_jiles@...cinc.com>,
        "Randy Dunlap" <rdunlap@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "patches@...ts.linux.dev" <patches@...ts.linux.dev>
Subject: Re: [PATCH v5] x86/resctrl: Add event choices for mba_MBps

Hi Tony,

On 12/4/2023 1:08 PM, Tony Luck wrote:
> On Mon, Dec 04, 2023 at 12:03:23PM -0800, Reinette Chatre wrote:
>> Hi Tony,
>>
>> On 12/4/2023 11:45 AM, Luck, Tony wrote:
>>>> Yes. I saw the thread. Even then I feel having two similar options can
>>>> cause confusion. I feel it is enough just to solve the original problem.
>>>> Giving more options to a corner cases is a overkill in my opinion.
>>>
>>> The "original" problem was systems without "local" bandwidth event. I
>>> wanted to give a way for users of mba_MBps to still have some way to
>>> use it (assuming that "total" bandwidth event was present).
>>>
>>> Reinette suggested that some people might want to use "total", even
>>> on systems that support "local". I firmly agree with that.  It is easy to
>>> construct scenarios where most bandwidth is to a remote node. using
>>> "local" event will do nothing to throttle in these case. I'm not at all sure
>>> why "local" event was picked. There's nothing in the LKML threads to
>>> give clues.
>>>
>>> I proposed a mount option "total" as a modifier to be used in conjunction
>>> with "mba_MBps". Reinette said it was too generic. Her suggestion was
>>> to add "mba_MBps_total" to be used instead of "mba_MBps".
>>
>> No, it cannot be used instead of "mba_MBps". My intention was for it to be
>> in addition to existing "mba_MBps" since taking "mba_MBps" away would be
>> considered breaking user space ABI.
> 
> I was unclear. The mba_MBps option must remain as legacy ABI. My
> "instead of" was intended to convey that a user wanting total bandwidth
> would use:
> 
> # mount -t resctrl -o mba_MBps_total resctrl /sys/fs/resctrl
> 
> rather than the new option being a modifier and requiring both
> the legacy option and the modifier like this:
> 
> # mount -t resctrl -o mba_MBps,mba_MBps_total resctrl /sys/fs/resctrl
> 
> which seems overly verbose.
> 

I misunderstood your comment. Thank you for clarifying.

>>
>> Even so ...
>>
>>>
>>> Is that where I should have gone, instead of "mba_MBps={local|total}"?
>>
>> While I did propose "mba_MBps_total" (in addition to "mba_MBps") I do
>> recognize your comment that a new key of mba_MBps_event does give more
>> flexibility if different events become available in future. Emphasis is
>> on "different" since I do not believe the parsing can support multiple
>> events and thus mba_MBps_event cannot be treated as a general bucket for all
>> mba_sc options, just different events guiding the feedback loop.
>>
>> "mba_MBps" must be kept and having it continue to use local bw as default,
>> but total bw on systems that do not support local bw seems appropriate,
>> (which is what this patch does).
> 
> So we defintely have:
> 
> "mba_MBps" - defaults to local, on systems without local may switch to
> total if that is available. Should this switch get a pr_info()? Or just happen
> silently (as I've done in patches so far).

I do not think a message is required ... I cannot see how it provides information
that user space does not already know. It surely does no harm and I would not
object if it is added. Even so, I do not think a kernel message should be what
is relied on to share with user space what guides the feedback loop. To that end
I think the mount options combined with the system capabilities (learned via
resctrl) provide that information.

Please do note that if the solution of this version is maintained then
rdtgroup_show_options() needs to be updated. With that done, user space should
be able to determine at any time during runtime (no matter if kernel log has been
cleared) which event is being used.

> and need to come to agreement on which of these to implement:
> 
> A) "mba_MBps_total" - forces use of total. Fails the mount if total is not
>    available.

The "cons" of this is that (a) user is not able to prevent the failover,
and (b) harder to support future events (which are unknown and difficult
to prepare for anyway).

> 
> B) "mba_MBps={local|total)" forces use of chosen event, fails if event
> is unavailable.

I assume you mean "mba_MBps_event={local|total}". This is my preference but
I would like to learn more about Babu's "overkill" argument. I do believe that
(B) is well motivated in [1] and has no impact on AMD.
My uncertainty here (considering one motivation is to future proof it against
adding more events) is the generic "local" and "total" names. Even so, all
I could come up with are "local_bw" and "total_bw", which I do not think are
improvements.

> 
> C) Something else.
> 
> D) Don't provide any way to force use of total event.

(D) is what would be needed at minimum. 

In support of Babu's comment I can see how having mba_MBps as well as
mba_MBps_event can cause confusion. To help with this the documentation could
be expanded with more detailed content and also examples.

Reinette

[1] https://lore.kernel.org/lkml/SJ1PR11MB60839F92B1C15A659CD32478FC86A@SJ1PR11MB6083.namprd11.prod.outlook.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ