Message-ID: <df73713e-eacf-447e-a8e3-860a1e0606b4@arm.com>
Date: Tue, 11 Feb 2025 18:37:03 +0000
From: James Morse <james.morse@....com>
To: Peter Newman <peternewman@...gle.com>,
 Reinette Chatre <reinette.chatre@...el.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
 Babu Moger <Babu.Moger@....com>, shameerali.kolothum.thodi@...wei.com,
 D Scott Phillips OS <scott@...amperecomputing.com>,
 carl@...amperecomputing.com, lcherian@...vell.com,
 bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
 baolin.wang@...ux.alibaba.com, Jamie Iles <quic_jiles@...cinc.com>,
 Xin Hao <xhao@...ux.alibaba.com>, dfustini@...libre.com,
 amitsinght@...vell.com, David Hildenbrand <david@...hat.com>,
 Rex Nie <rex.nie@...uarmicro.com>, Dave Martin <dave.martin@....com>,
 Koba Ko <kobak@...dia.com>, Shanker Donthineni <sdonthineni@...dia.com>
Subject: Re: [PATCH v6 00/42] x86/resctrl: Move the resctrl filesystem code to
 /fs/resctrl

Hi Peter,

On 11/02/2025 14:36, Peter Newman wrote:
> On Mon, Feb 10, 2025 at 6:24 PM Reinette Chatre
> <reinette.chatre@...el.com> wrote:
>> I'd like to check in on what you said in [1]. It sounded as though you were
>> planning to look at the assignable counter work from an Arm/MPAM
>> perspective but that work has since progressed (now at V11 [2]) without
>> input from Arm/MPAM perspective. As I understand assignable counters may benefit
>> MPAM and looking close to settled but it is difficult to gain confidence
>> in an interface that may (may not?) be used for MPAM without any feedback
>> from Arm/MPAM. I am trying to prevent future issues when/if MPAM needs to use
>> this new interface and find it confusing that there does not seem to be
>> any input from MPAM side. What am I missing?
> 
> I've looked into monitor assignment on MPAM a little, so I'll share my findings.
> 
> Like with ABMC/BMEC, MPAM's counters can be configured to monitor
> reads, writes, or both, so there are situations where it would be
> useful to be able to assign 2 counters to the same group to be able to
> break down the bandwidth between reads and writes. However, a group's
> two assignment slots are called "local" and "total", so if MPAM's
> resources only support one of the two, then only one counter can be
> assigned to a group.

Wouldn't this be a problem on AMD too?
... specifically 2 counters with different configurations to the same group ...


I suspect it may be simpler to support complex things like that via perf.
I'd dropped that in favour of ABMC, but one platform has come out of the woodwork where
there are only monitors on the L2 - and I don't think we should expose new counter files
via resctrl...


> MPAM does not support any filters that would differentiate between
> traffic serviced by local or remote memory, so it's difficult to see
> an MBM event other than "total" ever being used.

The driver guesses from the topology! If the counters used are on the L3, chances are they
are local to a NUMA node. If they're on the memory controller, it's probably total.

That code does need tightening up to check the cache boundaries match the NUMA boundaries
- but I haven't found a machine to test the bandwidth counters on at all yet.
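
As a rough illustration of that guess, with the NUMA-boundary check added, something like
the sketch below. Every name here (mon_location, mbm_event, guess_mbm_event) is made up for
the example and is not taken from the MPAM driver; treating an L3 that does not line up
with a NUMA node as "no event" is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical: where a bandwidth monitor sits. Not a real MPAM driver type. */
enum mon_location {
	MON_AT_L2,
	MON_AT_L3,
	MON_AT_MEMORY_CONTROLLER,
};

/* Hypothetical stand-ins for the two resctrl MBM events. */
enum mbm_event {
	MBM_NONE,
	MBM_LOCAL,
	MBM_TOTAL,
};

/*
 * Guess which resctrl event a monitor can back, based only on topology.
 * Assumption: an L3 monitor counts as "local" only when the cache
 * boundary matches the NUMA node boundary.
 */
static enum mbm_event guess_mbm_event(enum mon_location loc,
				      bool cache_matches_numa_node)
{
	switch (loc) {
	case MON_AT_L3:
		return cache_matches_numa_node ? MBM_LOCAL : MBM_NONE;
	case MON_AT_MEMORY_CONTROLLER:
		return MBM_TOTAL;
	default:
		/* e.g. monitors only on the L2: nothing to expose this way */
		return MBM_NONE;
	}
}
```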

I don't see how this would change what resctrl exposes - mbm_local and mbm_total already
exist. It's up to the MPAM driver to best match what it has with what it can expose to
user-space...


> Multiple MSCs
> measuring memory bandwidth at an interconnect and a local memory
> controller could potentially be used together to infer the "local"
> and "total" counts, but this would require the implementation to
> understand the platform-specific relationship between different types
> of MSCs and somehow present them as a single rdt_resource to resctrl.
> As best as I can tell, the MPAM driver today will choose "local" or
> "total"[1] for what it will present to the FS layer as an
> rdt_resource.

I think 'both' should fall out of that logic. It should keep moving the 'total' bandwidth
counter down the hierarchy until it reaches the memory controller.
I'd expect a platform that looks like this to have bandwidth monitors on the L3 (or
whatever cache matches the NUMA boundary) and bandwidth monitors on the memory controller.
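
A minimal sketch of how 'both' could fall out, assuming the hierarchy is walked from the
CPU side towards memory. The struct and function names are hypothetical and not from the
MPAM driver; dropping 'total' when it would land on the same level as 'local' is an
assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one level between the CPU and memory. */
struct level {
	const char *name;
	bool is_cache;
	bool has_bw_monitor;
	bool matches_numa_boundary;
};

/* Index into the level array for each event, or -1 if not available. */
struct mbm_choice {
	int local_idx;
	int total_idx;
};

/*
 * levels[] is ordered from the CPU side down to the memory controller.
 * 'local' is the first monitored cache that matches the NUMA boundary.
 * 'total' keeps moving down to the deepest monitored level.
 */
static struct mbm_choice pick_counters(const struct level *levels, size_t n)
{
	struct mbm_choice c = { .local_idx = -1, .total_idx = -1 };

	for (size_t i = 0; i < n; i++) {
		if (!levels[i].has_bw_monitor)
			continue;
		if (c.local_idx < 0 && levels[i].is_cache &&
		    levels[i].matches_numa_boundary)
			c.local_idx = (int)i;
		c.total_idx = (int)i;
	}

	/* Assumption: one set of monitors cannot back both events. */
	if (c.total_idx == c.local_idx)
		c.total_idx = -1;

	return c;
}
```

With monitors on both the L3 and the memory controller this picks one level for each
event; with monitors on only one of them, only one event is offered.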

Having two sets of bandwidth counters that measure different things in the same MSC is not
something that can be described by the firmware tables. (I did ask)

I think the logic here would be contained to the MPAM driver...


Thanks,

James

> Based on this, I would prefer the arch/fs refactoring changes go in
> first to give us more time to think about how better to abstract
> counter assignment on a non-RDTlike implementation. I believe finally
> settling on an arch/fs separation for the currently-supported feature
> set would make the counter assignment work clearer for everyone
> involved. Also, my own users have been using an implementation like
> this one successfully for over a year on ARM-based platforms while I'm
> still just experimenting with the usage model of ABMC on AMD hardware,
> so I consider the MPAM work to be more mature and would not like to
> see it delayed on account of ABMC.
