Message-ID: <CALPaoCjNgv4rFpbovyayNynuAqYGP0rVLv=djVnDO1LR+zU55g@mail.gmail.com>
Date: Fri, 8 Dec 2023 11:45:54 -0800
From: Peter Newman <peternewman@...gle.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: Babu Moger <babu.moger@....com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com,
James Morse <james.morse@....com>, x86@...nel.org,
hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, seanjc@...gle.com,
kim.phillips@....com, jmattson@...gle.com,
ilpo.jarvinen@...ux.intel.com, jithu.joseph@...el.com,
kan.liang@...ux.intel.com, nikunj@....com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com,
rick.p.edgecombe@...el.com, rppt@...nel.org,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, dhagiani@....com
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature

On Tue, Dec 5, 2023 at 3:17 PM Reinette Chatre
<reinette.chatre@...el.com> wrote:
> On 11/30/2023 4:57 PM, Babu Moger wrote:
> > c. Read the monitor states. There will be new file "monitor_state"
> > for each monitor group when ABMC feature is enabled. By default,
> > both total and local MBM events are in "unassign" state.
> >
> > #cat /sys/fs/resctrl/monitor_state
> > total=unassign;local=unassign
> >
> > d. Read the event mbm_total_bytes and mbm_local_bytes. Note that MBA
> > events are not available until the user assigns the events explicitly.
> > Users need to assign the counters to monitor the events in this mode.
> >
> > #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> > Unavailable
>
> How is the llc_occupancy event impacted when ABMC is enabled? Can all RMIDs
> still be used to track cache occupancy?
>
> >
> > #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> > Unavailable
>
> I believe that "Unavailable" already has an accepted meaning within current
> interface and is associated with temporary failure. Even the AMD spec states "This
> is generally a temporary condition and subsequent reads may succeed". In the
> scenario above there is no chance that this counter would produce a value later.
> I do not think it is ideal to overload existing interface with different meanings
> associated with a new hardware specific feature ... something like "Disabled" seems
> more appropriate.

Could we hide event counter files if they're not enabled? Is there
value in displaying the value of a non-running counter that will be
reset the next time it's enabled?

>
> Considering this we may even consider using these files themselves as a
> way to enable the counters if they are disabled. For example, just
> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
> to enable this counter. No need for a new "monitor_state". Please note that this
> is not an official proposal since there are two other use cases that still need to
> be considered as we await James's feedback on how this may work for MPAM and
> also how this may be useful on AMD hardware that does not support ABMC but
> users may want to get similar benefits ([1])

We plan to use the ABMC counters as a window to sample the MB/s rate
of a very large number of groups, so there's a serious concern about
the number of write syscalls this will take, as they will add up
quickly for a large RMID*domain count.

To that end, the ideal would be the ability to re-assign all ABMC
counters on all domains in a single system call.

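
To put rough numbers on the syscall concern, here is a back-of-the-envelope
sketch. It compares one write() per event file (as in the "echo 1 > ..."
idea quoted above) with a hypothetical interface where a single call
re-assigns every counter on every domain. The function names, the group,
domain and counter counts, and the single-call interface itself are all
assumptions for illustration; none of them come from the patch series.

```python
import math


def per_file_writes(groups: int, domains: int, events: int = 2) -> int:
    """Writes for one sampling pass when each (group, domain, event)
    file needs its own write() to enable the counter."""
    return groups * domains * events


def batched_writes(groups: int, counters_per_domain: int, events: int = 2) -> int:
    """Writes for one sampling pass if one call could re-assign all
    counters on all domains (hypothetical interface).

    Each group consumes `events` counters per domain, so a single call
    covers counters_per_domain // events groups on every domain at once.
    """
    groups_per_call = counters_per_domain // events
    if groups_per_call < 1:
        raise ValueError("not enough counters to monitor even one group")
    return math.ceil(groups / groups_per_call)


if __name__ == "__main__":
    # Illustrative sizes only (assumed, not measured).
    print(per_file_writes(groups=256, domains=16))
    print(batched_writes(groups=256, counters_per_domain=32))
```

With these assumed sizes (256 groups, 16 domains, 2 events, 32 counters per
domain), the per-file scheme needs 8192 writes per pass and the batched
scheme needs 16. The per-file count grows with the domain count; the
batched count does not.
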
-Peter