Message-ID: <CALPaoCj438UfH3QA_VnGo-pj2a_48sJufUWjBKT3MQatcMJ_Uw@mail.gmail.com>
Date: Thu, 22 May 2025 11:14:38 +0200
From: Peter Newman <peternewman@...gle.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: "Moger, Babu" <bmoger@....com>, babu.moger@....com, corbet@....net, tony.luck@...el.com,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, james.morse@....com, dave.martin@....com,
fenghuay@...dia.com, x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
ardb@...nel.org, gregkh@...uxfoundation.org, daniel.sneddon@...ux.intel.com,
jpoimboe@...nel.org, alexandre.chartre@...cle.com,
pawan.kumar.gupta@...ux.intel.com, thomas.lendacky@....com,
perry.yuan@....com, seanjc@...gle.com, kai.huang@...el.com,
xiaoyao.li@...el.com, kan.liang@...ux.intel.com, xin3.li@...el.com,
ebiggers@...gle.com, xin@...or.com, sohil.mehta@...el.com,
andrew.cooper3@...rix.com, mario.limonciello@....com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
maciej.wieczor-retman@...el.com, eranian@...gle.com, Xiaojian.Du@....com,
gautham.shenoy@....com
Subject: Re: [PATCH v13 00/27] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)

Hi Reinette,

On Thu, May 22, 2025 at 1:05 AM Reinette Chatre
<reinette.chatre@...el.com> wrote:
>
> Hi Peter,
>
> On 5/21/25 7:27 AM, Peter Newman wrote:
> > On Wed, May 21, 2025 at 1:44 AM Reinette Chatre
> > <reinette.chatre@...el.com> wrote:
> >> On 5/20/25 4:25 PM, Moger, Babu wrote:
>
> ...
> >>>
> >>> Here’s my current understanding of soft-ABMC. Peter may have a more in-depth perspective on this.
> >>>
> >>> Soft-ABMC:
> >>> a. num_mbm_cntrs: This is a software-defined limit based on the number of active RMIDs that can be supported. The value can be obtained using the code referenced in [4].
> >>>
> >>> b. Assignments: No hardware configuration is required. We simply need to ensure that no more than num_mbm_cntrs RMIDs are active at any given time.
> >>>
> >>> c. Configuration: Controlled via /info/L3_MON/mbm_total_bytes_config and mbm_local_bytes_config.
> >>>
> >>> d. Events: Only two events can be assigned (local and total).
> >>>
> >>> ABMC:
> >>> a. num_mbm_cntrs: This is defined by the hardware.
> >>> b. Assignments: Requires special MSR writes to assign counters.
> >>> c. Configuration: Comes from /info/L3_MON/counter_configs/.
> >>> d. Events: The interface allows more than two events to be assigned to a group (currently limited to 2).
> >>>
> >>> Commonalities:
> >>> a. Assignments can be either exclusive or shared in both these modes.
> >>>
> >>> Given these, I believe we can easily accommodate soft-ABMC in this interface.
> >>
> >> This is not so obvious to me. It looks to me as though the user is forced to interpret
> >> the content of resctrl files differently based on soft-ABMC vs ABMC making the interface
> >> inconsistent and user thus needing to know details of implementations. This is what the previous
> >> discussion I linked to aimed to address. It sounds to me as though you believe that this is no longer
> >> an issue. Could you please show examples of what a user can expect from the interfaces and how a user
> >> will interact with the interfaces on both a non-ABMC and ABMC system?
> >
> > At the interface level, I think mbm_L3_assignments on a non-ABMC
> > system would only need to contain a single line:
> >
> > 0=s;1=s;...;31=s
>
> It should be obvious to user space how to interpret the fields. When there is
> thus a single "mbm_cntr_assign" mode used for ABMC and soft-ABMC a single
> line like this would be difficult to parse since that would imply/require
> that user space knows whether it is running on ABMC or soft-ABMC system,
> which we should avoid.
>
> If there are different modes, for example "mbm_cntr_event_assign" and
> "mbm_cntr_group_assign" then this could be used by user space to distinguish
> how to interact with mbm_L3_assignments making something like this possible.

To clarify, I was proposing the format of this file for the group
assignment mode. I didn't mean to imply that a separate mode wasn't
needed.

>
> >
> > But maybe for consistency we would synthesize a single, unmodifiable
> > counter configuration to reflect that allocating an RMID in a domain
> > results in assignment to all events and deallocating the RMID
> > unassigns all events. We could call it "group" to say it's assigning
> > at the group level, or perhaps just '*':
> >
> > *:0=s;1=s;...;31=s
> >
> > I'm not sure about allowing a '*' on ABMC hardware, because it could
> > be interpreted as allocating a lot of counters when a large number of
> > event configurations exist.
> >
> > *:0=s;1=s;...;31=s
> >
>
> Either could work also. Whether it is "group" or "*" ABMC systems could
> respond with "not supported". Will think about this more but would
> like to hear your opinion about the flexibility that distinguishing between
> a "mbm_cntr_event_assign" and "mbm_cntr_group_assign" mode provides.

I agree it's clearer when they are separate modes. Between "*" and
"group", I prefer "group" because it seems the least ambiguous.
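
To make the "group" line format concrete, here is a rough Python sketch
of how user space might parse one such line. This is only an
illustration under assumptions: the function name and the returned
structure are hypothetical, the file format is still under discussion,
and the per-domain flag is treated as an opaque string.

```python
def parse_assignment_line(line):
    """Parse '<config>:<domain>=<flag>;...' into (config, {domain_id: flag}).

    Hypothetical helper: the layout follows the example line discussed
    in this thread, not a finalized resctrl interface.
    """
    config, sep, rest = line.strip().partition(":")
    if not sep:
        raise ValueError("missing '<config>:' prefix")
    domains = {}
    for field in rest.split(";"):
        dom, eq, flag = field.partition("=")
        if not eq or not dom.isdigit() or not flag:
            raise ValueError(f"malformed field: {field!r}")
        # The flag is kept as-is; its meaning is up to the interface.
        domains[int(dom)] = flag
    return config, domains


print(parse_assignment_line("group:0=s;1=s;31=s"))
```

With a fixed "group" prefix, a parser like this does not need to know
anything about individual event configurations.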

I just want to make sure we'd never want both modes at the same time,
such as an implementation with both a small number of monitoring IDs
and a small number of MBM counters. I support one MPAM implementation
that has a small number of PARTIDs and only one MBWU counter per
domain. Fingers crossed that the number of PARTIDs it supports isn't
small compared to the number of jobs we would run on it. Otherwise,
maybe it will work out to just pick the more limited of the two
(monitor IDs or counters) and make allocation of one drive the other.

(In case you read this before my earlier reply[1], see the note about
rdtgroup pointers in the task_struct, as this is a prerequisite for
overcommitting HW monitor IDs.)

Thanks,
-Peter

[1] https://lore.kernel.org/lkml/CALPaoCjh_NXQLtNBqei=7a6Jsr17fEnPO+kqMaNq4xNu2UPDJA@mail.gmail.com/
>
> Reinette
>
> > -Peter
> >
> >
> >>
> >> Thank you
> >>
> >> Reinette
> >>
> >>>
> >>>>>>
> >>>>>> [2] https://lore.kernel.org/lkml/afb99efe-0de2-f7ad-d0b8-f2a0ea998efd@amd.com/
> >>>>>> [3] https://lore.kernel.org/lkml/CALPaoCg3KpF94g2MEmfP_Ro2mQZYFA8sKVkmb+7isotKNgdY9A@mail.gmail.com/
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>