Message-ID: <871pl9krdz.fsf@stealth>
Date: Fri, 05 Dec 2025 13:08:40 +0000
From: Punit Agrawal <punit.agrawal@....qualcomm.com>
To: James Morse <james.morse@....com>
Cc: Punit Agrawal <punit.agrawal@....qualcomm.com>,
Ben Horgan <ben.horgan@....com>, amitsinght@...vell.com,
baisheng.gao@...soc.com, baolin.wang@...ux.alibaba.com,
bobo.shaobowang@...wei.com, carl@...amperecomputing.com,
catalin.marinas@....com, dakr@...nel.org, dave.martin@....com,
david@...hat.com, dfustini@...libre.com, fenghuay@...dia.com,
gregkh@...uxfoundation.org, gshan@...hat.com, guohanjun@...wei.com,
jeremy.linton@....com, jonathan.cameron@...wei.com, kobak@...dia.com,
lcherian@...vell.com, lenb@...nel.org, linux-acpi@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
lpieralisi@...nel.org, peternewman@...gle.com, quic_jiles@...cinc.com,
rafael@...nel.org, robh@...nel.org, rohit.mathew@....com,
scott@...amperecomputing.com, sdonthineni@...dia.com,
sudeep.holla@....com, tan.shaopeng@...itsu.com, will@...nel.org,
xhao@...ux.alibaba.com, reinette.chatre@...el.com
Subject: Re: [PATCH v6 00/34] arm_mpam: Add basic mpam driver

James Morse <james.morse@....com> writes:

> Hi Punit,
>
> On 03/12/2025 11:21, Punit Agrawal wrote:
>> Ben Horgan <ben.horgan@....com> writes:
>>> On 11/24/25 15:21, Punit Agrawal wrote:
>>>> Although a little late to the party,
>
> There was a party?!
>
>
>>>> I've managed to throw together
>>>> enough firmware to describe the MPAM hardware and take this set (more
>>>> specifically mpam/snapshot/v6.18-rc4-v5 branch from James' repository)
>>>> for a spin. Using the branch, the kernel is able to probe the hardware
>>>> and discover the advertised features. Yay! We are in business.
>>>
>>> Thanks for giving it a go. :)
>>>
>>>>
>>>> Having said that, there are a few quirks of the platform that run into
>>>> issues with later patches in the branch.
>
> So something in the resctrl support code is causing this.
> Any idea which patch causes this to happen?
>
> There are a load of pr_debug() in the picking logic, if you enable DYNDEBUG and add:
> | dyndbg="file mpam_resctrl.c +pl"
>
> to the commandline, you should get some snotty messages about what non-Xeon-like property
> your platform has.

Thanks - I've got this enabled.

The platform looks very different to a Xeon. One notable difference is
the shared L2, hence all the MSCs are attached there.

>>>> The platform has MSCs attached
>>>> to shared L2 caches which are being skipped during later stages of
>>>> initialisation. IIUC, the L2 MSCs' limitations stem from the
>>>> assumptions in the resctrl interface.
>>>
>>> What in particular is being skipped?
>
>> The registration of the discovered MSCs with resctrl and subsequently
>> exposing them to the user.
>
> resctrl's 'L2' support is limited to the CPOR bitmap.
> If you have controls, there is no resctrl 'event' that can expose them.
> (the problem being they all have 'L3' in the name!)
>>>> I was wondering if there are any patches available to relax these
>>>> limitations?
> Knowing which property it is will help - but some of these things are checked
> to match resctrl's ABI. They can't necessarily be relaxed without breaking
> user-space.

This platform has portion, capacity and priority partitioning, as well
as memory bandwidth and cache storage monitoring. The MPAM code seems to
parse the properties correctly.

But as you point out, the resctrl 'L2' support doesn't have anything
other than the CPOR bitmap yet. Have you looked at what's needed to
extend resctrl to support some of the others?
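
To make the CPOR case concrete, here is a minimal sketch of what a
cache-portion bitmap check could look like. It is not taken from the
MPAM driver: the function name, the 32-bit width and the rule that an
empty bitmap is rejected are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the driver's code. Bit n set in 'bitmap'
 * means the partition may allocate into cache portion n.
 * 'num_portions' is the number of portions the cache implements.
 */
bool portion_bitmap_valid(uint32_t bitmap, unsigned int num_portions)
{
	uint32_t implemented;

	if (num_portions == 0 || num_portions > 32)
		return false;

	/* Mask of implemented portions; avoid an undefined 32-bit shift. */
	if (num_portions == 32)
		implemented = UINT32_MAX;
	else
		implemented = (UINT32_C(1) << num_portions) - 1;

	/* Assumption: an empty bitmap (no cache at all) is not allowed. */
	if (bitmap == 0)
		return false;

	/* Reject bits that name portions the cache does not have. */
	return (bitmap & ~implemented) == 0;
}
```

With 8 portions, 0xff would pass and 0x100 would be rejected. The other
control types this platform has (capacity, priority) do not fit this
shape, which is the gap I'm asking about.
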
> Others are sanity checks, e.g. all CPUs are represented. This is to avoid tasks
> that run on cpu-9 escaping the resctrl controls. Platforms that did this may as
> well not bother with resctrl at all.
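
As I understand the "all CPUs are represented" check, it amounts to
something like the sketch below. This is only an illustration under my
own assumptions: the function name, the 64-CPU limit and the plain
bitmask representation are hypothetical and not what the driver uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not the driver's code. Each entry in affinity[]
 * is a bitmask of the CPUs behind one cache instance (and its MSC).
 * The class is only usable if, together, the instances cover every
 * possible CPU.
 */
bool class_covers_all_cpus(const uint64_t *affinity, size_t count,
			   uint64_t possible_cpus)
{
	uint64_t covered = 0;
	size_t i;

	for (i = 0; i < count; i++)
		covered |= affinity[i];

	/* A possible CPU missing from the union could escape the controls. */
	return (possible_cpus & ~covered) == 0;
}
```

For example, two shared L2s covering CPUs 0-3 and 4-7 pass on an 8-CPU
system, but would fail if cpu-9 were also possible and no L2 MSC
claimed it.
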
>
>
>>>> I can give them a try. Or do these need to be put together
>>>> from the ground up? Any pointers greatly appreciated.
>>>
>>> There are some extra things added in the extras branch [1] e.g. cache
>>> maximum usage controls (cmax). However, lots of possible things are
>>> still missing e.g. any monitors on L2. If it doesn't fit with the
>>> topology expected by resctrl then it is unlikely to have been considered
>>> yet.
>>
>> Thanks for the pointer. I'll give the snapshot+extras branch[1] a try.
>>
>> The platform does have both controls and monitors attached to L2. If
>> this isn't being looked at, I can try and put something together. Thanks
>> for confirming that the limitation is likely due to resctrl.
>
> My view on 'extra' counters is to try and expose them via perf, as this would also
> allow platform specific counters. I worry that if we start adding 'easy' ones like
> l2_mbm_total to resctrl, someone will want
> left_hand_side_of_soc_mbm_total.

I wouldn't put L2 in the same category as 'left_hand_side_of_soc'. You
call it 'easy' for a reason: L2 is pretty well understood and resctrl
already exposes an interface for it. I would avoid creating a new
interface for users.

For some of the other boundaries, such as 'left_hand_side_of_soc', I
wonder if firmware-provided topology (e.g., PPTT, SRAT) could be
used to make even these work.