Message-ID: <c157d2b6-7282-438f-b9e7-1a24be3fbf53@arm.com>
Date: Mon, 19 Jan 2026 12:04:10 +0000
From: James Morse <james.morse@....com>
To: Peter Newman <peternewman@...gle.com>, Ben Horgan <ben.horgan@....com>
Cc: amitsinght@...vell.com, baisheng.gao@...soc.com,
baolin.wang@...ux.alibaba.com, carl@...amperecomputing.com,
dave.martin@....com, david@...nel.org, dfustini@...libre.com,
fenghuay@...dia.com, gshan@...hat.com, jonathan.cameron@...wei.com,
kobak@...dia.com, lcherian@...vell.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
punit.agrawal@....qualcomm.com, quic_jiles@...cinc.com,
reinette.chatre@...el.com, rohit.mathew@....com,
scott@...amperecomputing.com, sdonthineni@...dia.com,
tan.shaopeng@...itsu.com, xhao@...ux.alibaba.com, catalin.marinas@....com,
will@...nel.org, corbet@....net, maz@...nel.org, oupton@...nel.org,
joey.gouly@....com, suzuki.poulose@....com, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v3 29/47] arm_mpam: resctrl: Pick classes for use as mbm
counters
Hi Peter,
On 15/01/2026 15:49, Peter Newman wrote:
> On Mon, Jan 12, 2026 at 6:02 PM Ben Horgan <ben.horgan@....com> wrote:
>> From: James Morse <james.morse@....com>
>>
>> resctrl has two types of counters, NUMA-local and global. MPAM has only
>> bandwidth counters, but the position of the MSC may mean it counts
>> NUMA-local, or global traffic.
>>
>> But the topology information is not available.
>>
>> Apply a heuristic: the L2 or L3 supports bandwidth monitors, these are
>> probably NUMA-local. If the memory controller supports bandwidth monitors,
>> they are probably global.
> Are remote memory accesses not cached? How do we know an MBWU monitor
> residing on a cache won't count remote traffic?
It will, yes: you get double counting. Would it be preferable to forbid exposing both
mbm_total and mbm_local at the same time?
I think this comes from 'total' in mbm_total not really having the obvious meaning of the
word:
Say I have a system where NUMA-A has the CPUs and no memory controllers, and NUMA-B has no
CPUs and all the memory controllers.
With MPAM: we've only got one type of bandwidth counter, and it doesn't know where the
traffic goes after the MSC. mbm_local on the L3 would reflect all the bandwidth, and
mbm_total on the memory controllers would have the same number.
I think on x86 mbm_local on the CPUs would read zero as zero traffic went to the 'local'
memory controller, and mbm_total would reflect all the memory bandwidth. (so 'total'
really means 'other')
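To put numbers on that example, here is a toy model of the two read-outs. It is only a
sketch: the struct and function names are made up for illustration, the x86 side encodes
my understanding as described above rather than anything checked against hardware, and it
assumes every access misses the cache so the L3 and the memory controllers see the same
bytes.

```c
#include <stdint.h>

/* Hypothetical model: bytes moved by one monitoring group running on NUMA-A. */
struct traffic {
	uint64_t to_local_mc;	/* served by a memory controller in the CPU's own node */
	uint64_t to_remote_mc;	/* served by a memory controller in another node */
};

/* What user-space would read back for that group. */
struct mbm_readout {
	uint64_t local;
	uint64_t total;
};

/*
 * MPAM as described above: one kind of bandwidth counter that cannot tell
 * where traffic goes after the MSC. The L3 monitor (exposed as mbm_local)
 * and the memory-controller monitor (exposed as mbm_total) both count
 * everything. Assumption: no cache hits, so both see the same bytes.
 */
struct mbm_readout mpam_readout(struct traffic t)
{
	uint64_t all = t.to_local_mc + t.to_remote_mc;

	return (struct mbm_readout){ .local = all, .total = all };
}

/*
 * x86 as I understand it: mbm_local only counts traffic to the local
 * memory controller, mbm_total counts all of it.
 */
struct mbm_readout x86_readout(struct traffic t)
{
	return (struct mbm_readout){
		.local = t.to_local_mc,
		.total = t.to_local_mc + t.to_remote_mc,
	};
}
```

With the NUMA-A/NUMA-B layout above, all traffic is remote, so to_local_mc is zero: the
MPAM model reports the same value for both files, while the x86 model reports zero for
mbm_local.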
I think what MPAM is doing here is still useful, as a system normally has both CPUs and
memory controllers in its NUMA nodes. You can use this to spot a control/monitor group
on a NUMA node that is hammering all the memory (outlier mbm_local), or the case where a
NUMA node's memory controller is getting hammered by all the NUMA nodes (outlier
mbm_total).
I've not heard of a platform with both memory bandwidth monitors at L3 and the memory
controller, so this may be a theoretical issue.
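For reference, the heuristic from the commit message boils down to something like the
following. This is a sketch of the rule only, not the driver code: the enum names and
pick_mbm_event() are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical classification of where an MSC class sits. */
enum msc_class_type {
	CLASS_CACHE_L2,
	CLASS_CACHE_L3,
	CLASS_MEMORY,
	CLASS_OTHER,
};

/* Which resctrl counter file a class would back, if any. */
enum mbm_event {
	MBM_NONE,
	MBM_LOCAL,
	MBM_TOTAL,
};

/*
 * Heuristic, as the topology is not available:
 *  - bandwidth monitors on the L2 or L3 are probably NUMA-local,
 *  - bandwidth monitors on the memory controller are probably global,
 *  - anything else, or a class without bandwidth monitors, is not used.
 */
enum mbm_event pick_mbm_event(enum msc_class_type type, bool has_bw_monitors)
{
	if (!has_bw_monitors)
		return MBM_NONE;

	switch (type) {
	case CLASS_CACHE_L2:
	case CLASS_CACHE_L3:
		return MBM_LOCAL;
	case CLASS_MEMORY:
		return MBM_TOTAL;
	default:
		return MBM_NONE;
	}
}
```

A platform with monitors at both the L3 and the memory controller is the only case where
both MBM_LOCAL and MBM_TOTAL get picked, which is where the double counting would be
visible.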
Shall we only expose one of mbm_local/mbm_total to prevent this being seen by user-space?
Thanks,
James