Message-ID: <CALPaoCi0fR5sUVsjXi25XHBrhn3whFvKDpEvUGSM5Hjd2LZP6w@mail.gmail.com>
Date: Mon, 19 Jan 2026 13:47:52 +0100
From: Peter Newman <peternewman@...gle.com>
To: James Morse <james.morse@....com>
Cc: Ben Horgan <ben.horgan@....com>, amitsinght@...vell.com, baisheng.gao@...soc.com, 
	baolin.wang@...ux.alibaba.com, carl@...amperecomputing.com, 
	dave.martin@....com, david@...nel.org, dfustini@...libre.com, 
	fenghuay@...dia.com, gshan@...hat.com, jonathan.cameron@...wei.com, 
	kobak@...dia.com, lcherian@...vell.com, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, punit.agrawal@....qualcomm.com, 
	quic_jiles@...cinc.com, reinette.chatre@...el.com, rohit.mathew@....com, 
	scott@...amperecomputing.com, sdonthineni@...dia.com, 
	tan.shaopeng@...itsu.com, xhao@...ux.alibaba.com, catalin.marinas@....com, 
	will@...nel.org, corbet@....net, maz@...nel.org, oupton@...nel.org, 
	joey.gouly@....com, suzuki.poulose@....com, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v3 29/47] arm_mpam: resctrl: Pick classes for use as mbm counters

Hi James,

On Mon, Jan 19, 2026 at 1:04 PM James Morse <james.morse@....com> wrote:
>
> Hi Peter,
>
> On 15/01/2026 15:49, Peter Newman wrote:
> > On Mon, Jan 12, 2026 at 6:02 PM Ben Horgan <ben.horgan@....com> wrote:
> >> From: James Morse <james.morse@....com>
> >>
> >> resctrl has two types of counters, NUMA-local and global. MPAM has only
> >> bandwidth counters, but the position of the MSC may mean it counts
> >> NUMA-local, or global traffic.
> >>
> >> But the topology information is not available.
> >>
> >> Apply a heuristic: if the L2 or L3 supports bandwidth monitors, these are
> >> probably NUMA-local. If the memory controller supports bandwidth monitors,
> >> they are probably global.
>
> > Are remote memory accesses not cached? How do we know an MBWU monitor
> > residing on a cache won't count remote traffic?
>
> It will, yes; you get double counting. Is forbidding both mbm_total and mbm_local preferable?
>
> I think this comes from 'total' in mbm_total not really having the obvious meaning of the
> word:
> Suppose NUMA-A has CPUs and no memory controllers, and NUMA-B has no CPUs and all the
> memory controllers.
> With MPAM: we've only got one bandwidth counter, it doesn't know where the traffic goes
> after the MSC. mbm_local on the L3 would reflect all the bandwidth, and mbm_total on the
> memory controllers would have the same number.
> I think on x86 mbm_local on the CPUs would read zero as zero traffic went to the 'local'
> memory controller, and mbm_total would reflect all the memory bandwidth. (so 'total'
> really means 'other')

Our software is going off the definition from the Intel SDM:

"This event monitors the L3 external bandwidth satisfied by the local
memory. In most platforms that support this event, L3 requests are
likely serviced by a memory system with non-uniform memory
architecture. This allows bandwidth to off-package memory resources to
be tracked by subtracting local from total bandwidth (for instance,
bandwidth over QPI to a memory controller on another physical
processor could be tracked by subtraction)."

On NUMA-capable hardware that supports this event but where all memory
is local, mbm_local == mbm_total. In practice you can't read both
counters at the same instant from userspace, so if you read mbm_total
first, mbm_local will have advanced by the time you read it and you'll
probably get a small negative result for remote bandwidth.
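
To make the read-ordering effect concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: `AllLocalTraffic` is a toy model
of a system where all memory is local, not a resctrl interface, and the
4096 bytes per read is an arbitrary stand-in for traffic that arrives
between two consecutive reads.

```python
class AllLocalTraffic:
    """Toy model (an assumption, not a real API): all memory is local, so
    mbm_total and mbm_local observe the same ever-growing byte count."""

    def __init__(self, bytes_per_read=4096):
        self.bytes_seen = 0
        self.bytes_per_read = bytes_per_read

    def _sample(self):
        # Return the current count, then let traffic continue to flow
        # before the next read happens.
        value = self.bytes_seen
        self.bytes_seen += self.bytes_per_read
        return value

    def read_mbm_total(self):
        return self._sample()

    def read_mbm_local(self):
        return self._sample()


def remote_bytes(total, local):
    """Remote traffic by subtraction, clamped so read skew never goes negative."""
    return max(total - local, 0)


system = AllLocalTraffic()
total = system.read_mbm_total()   # read first
local = system.read_mbm_local()   # read second, after more traffic
raw_remote = total - local        # negative, although no remote traffic exists
clamped_remote = remote_bytes(total, local)
```

Clamping at zero is only one way a consumer might handle the skew;
reading mbm_local first would instead bias the estimate slightly positive.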

>
> I think what MPAM is doing here is still useful as a system normally has both CPUs and
> memory controllers in the NUMA nodes, and you can use this to spot a control/monitor group
> on a NUMA-node that is hammering all the memory (outlier mbm_local), or the same where a
> NUMA-node's memory controller is getting hammered by all the NUMA nodes (outlier
> mbm_total)
>
> I've not heard of a platform with both memory bandwidth monitors at L3 and the memory
> controller, so this may be a theoretical issue.
>
> Shall we only expose one of mbm_local/mbm_total to prevent this being seen by user-space?

I believe in the current software design, MPAM is only able to support
mbm_total, as an individual MSC (or class of MSCs with the same
configuration) can't separate traffic by destination, so it must be
the combined value. On a hardware design where MSCs were placed such
that one only counts local traffic and another only counts remote, the
resctrl MPAM driver would have to understand the hardware
configuration well enough to be able to produce counts following
Intel's definition of mbm_local and mbm_total.
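
As a rough sketch of that decision, the Python below models which events
could be exposed given what each MSC class is able to count. This is an
illustration only, not the resctrl MPAM driver: `MscClass`, its fields
and `exposable_events` are hypothetical names, and it assumes the
per-class topology knowledge (`counts_local`, `counts_remote`) is somehow
available, which it is not today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MscClass:
    """Hypothetical description of a class of MSCs."""
    name: str
    has_mbwu: bool        # class has bandwidth-usage monitors
    counts_local: bool    # monitors see traffic to local memory
    counts_remote: bool   # monitors see traffic to remote memory


def exposable_events(classes):
    """Return the set of event names that would match Intel's definitions."""
    monitors = [c for c in classes if c.has_mbwu]
    local_only = any(c.counts_local and not c.counts_remote for c in monitors)
    remote_only = any(c.counts_remote and not c.counts_local for c in monitors)
    combined = any(c.counts_local and c.counts_remote for c in monitors)

    events = set()
    # Total needs either one combined counter or a local-only plus a
    # remote-only counter whose values can be summed.
    if combined or (local_only and remote_only):
        events.add("mbm_total")
    # Local needs a counter that is known to exclude remote traffic.
    if local_only:
        events.add("mbm_local")
    return events


# An MSC that cannot separate traffic by destination: only mbm_total.
l3_combined = [MscClass("L3", True, True, True)]
# Hypothetical split placement: both events become possible.
split = [MscClass("near", True, True, False), MscClass("far", True, False, True)]
```

With these inputs, `exposable_events(l3_combined)` yields only mbm_total,
while `exposable_events(split)` yields both events.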

Thanks,
-Peter
