Message-ID: <eb435a64-70d4-4821-908d-686243fec7a6@intel.com>
Date: Thu, 20 Feb 2025 10:36:18 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Dave Martin <Dave.Martin@....com>
CC: Peter Newman <peternewman@...gle.com>, "Moger, Babu" <bmoger@....com>,
Babu Moger <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>,
<tony.luck@...el.com>, <x86@...nel.org>, <hpa@...or.com>,
<paulmck@...nel.org>, <akpm@...ux-foundation.org>, <thuth@...hat.com>,
<rostedt@...dmis.org>, <xiongwei.song@...driver.com>,
<pawan.kumar.gupta@...ux.intel.com>, <daniel.sneddon@...ux.intel.com>,
<jpoimboe@...nel.org>, <perry.yuan@....com>, <sandipan.das@....com>,
<kai.huang@...el.com>, <xiaoyao.li@...el.com>, <seanjc@...gle.com>,
<xin3.li@...el.com>, <andrew.cooper3@...rix.com>, <ebiggers@...gle.com>,
<mario.limonciello@....com>, <james.morse@....com>,
<tan.shaopeng@...itsu.com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <maciej.wieczor-retman@...el.com>,
<eranian@...gle.com>
Subject: Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)

Hi Dave,

On 2/20/25 9:46 AM, Dave Martin wrote:
> Hi again,
>
> On Thu, Feb 20, 2025 at 04:46:40PM +0000, Dave Martin wrote:
>> Hi,
>>
>> On Wed, Feb 19, 2025 at 09:56:29AM -0800, Reinette Chatre wrote:
>>> Hi Peter,
>>>
>>> On 2/19/25 3:28 AM, Peter Newman wrote:
>>
>> [...]
>>
>>>> In the letters as events model, choosing the events assigned to a
>>>> group wouldn't be enough information, since we would want to control
>>>> which events should share a counter and which should be counted by
>>>> separate counters. I think the amount of information that would need
>>>> to be encoded into mbm_assign_control to represent the level of
>>>> configurability supported by hardware would quickly get out of hand.
>>>>
>>>> Maybe as an example, one counter for all reads, one counter for all
>>>> writes in ABMC would look like...
>>>>
>>>> (L3_QOS_ABMC_CFG.BwType field names below)
>>>>
>>>> (per domain)
>>>> group 0:
>>>> counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill
>>>> counter 1: VictimBW,LclNTWr,RmtNTWr
>>>> group 1:
>>>> counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill
>>>> counter 3: VictimBW,LclNTWr,RmtNTWr
>>>> ...
>>>>
>>>
>>> I think this may also be what Dave was heading towards in [2] but in that
>>> example and above the counter configuration appears to be global. You do mention
>>> "configurability supported by hardware" so I wonder if per-domain counter
>>> configuration is a requirement?
>>>
>>> Until now I viewed counter configuration separate from counter assignment,
>>> similar to how AMD's counters can be configured via mbm_total_bytes_config and
>>> mbm_local_bytes_config before they are assigned. That is still per-domain
>>> counter configuration though, not per-counter.
>>
>> I hadn't tried to work the design through in any detail: it wasn't
>> intended as a suggestion for something we should definitely do right
>> now; rather, it was just an incomplete sketch of one possible future
>> evolution of the interface.
>>
>> Either way these feel like future concerns, if the first iteration of
>> ABMC is just to provide the basics so that ABMC hardware can implement
>> resctrl without userspace seeing counters randomly stopping and
>> resetting...
>>
>> Peter, can you give a view on whether the ABMC as proposed in this series
>> is a useful stepping-stone? Or are there things that you need that you
>> feel could not be added as a later extension without ABI breakage?
>>
>> [...]
>>
>>>> I believe that shared assignments will take care of all the
>>>> high-frequency and performance-intensive batch configuration updates I
>>>> was originally concerned about, so I no longer see much benefit in
>>>> finding ways to textually encode all this information in a single file
>>>> when it would be more manageable to distribute it around the
>>>> filesystem hierarchy.
>>>
>>> This is significant. The motivation for the single file was to support
>>> the "high-frequency and performance-intensive" usage. Would "shared assignments"
>>> not also depend on the same files that, if distributed, will require many
>>> filesystem operations?
>>> Having the files distributed will be significantly simpler while also
>>> avoiding the file size issue that Dave Martin exposed.
>>>
>>> Reinette
>>
>> I still haven't fully understood the "shared assignments" proposal;
>> I need to go back and look at it.
>
> Having taken a quick look at that now, this all seems to duplicate
> perf's design journey (again).
>
> "rate" events make some sense. The perf equivalent is to keep an
> accumulated count of the amount of time a counter has been assigned to
> an event, and another accumulated count of the events counted by the
> counter during assignment. Only userspace knows what it wants to do
> with this information: perf exposes the raw accumulated counts.
>
> Perf events can also be pinned so that they are prioritised for
> assignment to counters; that sounds a lot like the regular, non-shared
> resctrl counters.
>
>
> Playing devil's advocate:
>
> It does feel like we are doomed to reinvent perf if we go too far down
> this road...
>
>> If we split the file, it will be more closely aligned with the design
>> of the rest of the resctrlfs interface.
>>
>> OTOH, the current interface seems workable and I think the file size
>> issue can be addressed without major re-engineering.
>>
>> So, from my side, I would not consider the current interface design
>> a blocker.
>
> ...so, drawing a hard line around the use cases that we intend to
> address with this interface and avoiding feature creep seems desirable.

This is exactly what I am trying to do ... to understand what use cases
the interface is expected to support.

You have mentioned a couple of times now that this interface is sufficient but
at the same time you hinted at some features from MPAM that I do not see
possible to accommodate with this interface.

> resctrlfs is already in the wild, so providing reasonable baseline
> compatibility with that interface for ABMC hardware is a sensible goal.
> The current series does that.
>
> But I wonder how much additional functionality we should really be
> adding via the mbm_assign_control interface, once this series is
> settled.

Are you speculating that MPAM counters may not make use of this interface?

Reinette
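
For reference, the perf-style accounting quoted above (an accumulated count of the time a counter was assigned, plus an accumulated count of the events seen during assignment) could be consumed by userspace roughly as sketched below. This is only an illustration: the `SharedCounterSample` type, its field names, and the helper functions are hypothetical and do not correspond to any existing resctrl or perf interface.

```python
from dataclasses import dataclass
from typing import Optional

NSEC_PER_SEC = 1_000_000_000


@dataclass
class SharedCounterSample:
    """Hypothetical raw readout for one event that time-shares a counter."""
    events: int        # events counted while a counter was assigned
    assigned_ns: int   # accumulated time a counter was assigned to the event
    elapsed_ns: int    # total time the event has been monitored


def rate_per_second(sample: SharedCounterSample) -> Optional[float]:
    """Average event rate over the time a counter was actually assigned."""
    if sample.assigned_ns == 0:
        return None  # never assigned: no information to derive a rate from
    return sample.events * NSEC_PER_SEC / sample.assigned_ns


def estimated_total(sample: SharedCounterSample) -> Optional[float]:
    """Extrapolate the count to the whole monitored period.

    This assumes the rate seen while assigned is representative of the
    unobserved time, which is a policy choice left to userspace.
    """
    rate = rate_per_second(sample)
    if rate is None:
        return None
    return rate * sample.elapsed_ns / NSEC_PER_SEC
```

Exposing only the two raw accumulated values, as perf does, leaves the choice between a rate and an extrapolated total to the consumer.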