Message-ID: <ZyzxbhHQptbktfGH@agluck-desk3>
Date: Thu, 7 Nov 2024 08:57:18 -0800
From: Tony Luck <tony.luck@...el.com>
To: Peter Newman <peternewman@...gle.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>, fenghua.yu@...el.com,
babu.moger@....com, bp@...en8.de, dave.hansen@...ux.intel.com,
eranian@...gle.com, hpa@...or.com, james.morse@....com,
linux-kernel@...r.kernel.org, mingo@...hat.com, nert.pinx@...il.com,
tan.shaopeng@...itsu.com, tglx@...utronix.de, x86@...nel.org
Subject: Re: [PATCH v2 2/2] x86/resctrl: Don't workqueue local event counter
reads
On Thu, Nov 07, 2024 at 03:26:11PM +0100, Peter Newman wrote:
> On Thu, Nov 7, 2024 at 12:01 PM Peter Newman <peternewman@...gle.com> wrote:
> >
> > Hi Reinette,
> >
> > On Thu, Nov 7, 2024 at 2:10 AM Reinette Chatre <reinette.chatre@...el.com> wrote:
>
> > > This sounds as though user space is essentially duplicating what the
> > > MBM overflow handler currently does, which is to run a worker in each domain
> > > to collect MBM data every second from every RMID for both MBM events.
> > >
> > > * What are the requirements of this use case?
> >
> > Accurate, per-RMID MBps data, ideally at 1-second resolution if the
> > overhead can be tolerable.
>
> Sorry, forgot about the assignable counters issue...
>
> On AMD we'll have to cycle the available event counters through the
> groups in order to get valid bandwidth counts.
See below.
> > > For example,
> > > # cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_00
> > > <rdtgroup nameA> <MBM total count>
> > > <rdtgroup nameB> <MBM total count>
> > > ...
> > >
> > > # cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_01
> > > <rdtgroup nameA> <MBM total count>
> > > <rdtgroup nameB> <MBM total count>
> > > ...
How about:
# cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_00
<rdtgroup nameA> <MBM total count> <timestamp> <generation>
<rdtgroup nameB> <MBM total count> <timestamp> <generation>
...
> > >
# cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_01
<rdtgroup nameA> <MBM total count> <timestamp> <generation>
<rdtgroup nameB> <MBM total count> <timestamp> <generation>
...
Where <timestamp> tracks when this sample was captured. And
<generation> is an integer that is incremented when data
for this event is lost (e.g. due to ABMC counter re-assignment).
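As an illustration only, a monitor application might parse one line of the proposed four-column format as sketched below. The struct, the function name, the 64-byte name limit, and the assumption that group names contain no whitespace are all hypothetical; the format itself is only a proposal in this thread.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GROUP_NAME_MAX 64	/* arbitrary limit chosen for this sketch */

/* One line of the proposed snapshot file (hypothetical layout). */
struct mbm_snapshot_line {
	char group[GROUP_NAME_MAX];	/* <rdtgroup name>, assumed to have no spaces */
	uint64_t count;			/* <MBM total count> */
	uint64_t timestamp;		/* <timestamp>, units not specified in the proposal */
	uint64_t generation;		/* <generation> */
};

/* Returns true only if all four fields were parsed from the line. */
static bool parse_snapshot_line(const char *line, struct mbm_snapshot_line *out)
{
	int n = sscanf(line, "%63s %" SCNu64 " %" SCNu64 " %" SCNu64,
		       out->group, &out->count, &out->timestamp,
		       &out->generation);

	return n == 4;
}
```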
Then a monitor application can sample periodically and compute
bandwidth for each group:
	if (thisgeneration == lastgeneration) {
		bw = (thiscount - lastcount) / (thistimestamp - lasttimestamp);
	}
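The check above could be fleshed out roughly as follows. This is a sketch under assumptions, not part of the proposal: the struct and function names are made up, and the timestamp is assumed to be in nanoseconds, which the proposal does not specify.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two consecutive samples of one group are compared (hypothetical layout). */
struct mbm_sample {
	uint64_t count;		/* <MBM total count>, in bytes */
	uint64_t timestamp_ns;	/* <timestamp>; nanoseconds is an assumption */
	uint64_t generation;	/* bumped when data for the event is lost */
};

/*
 * Store bytes per second in *bw and return true when the two samples
 * are comparable. Return false when the generation changed (data was
 * lost between the samples) or when no time has elapsed.
 */
static bool mbm_bandwidth(const struct mbm_sample *last,
			  const struct mbm_sample *cur, double *bw)
{
	if (cur->generation != last->generation)
		return false;
	if (cur->timestamp_ns <= last->timestamp_ns)
		return false;

	double seconds = (double)(cur->timestamp_ns - last->timestamp_ns) / 1e9;

	*bw = (double)(cur->count - last->count) / seconds;
	return true;
}
```

When the function returns false, the caller would keep the newer sample as the baseline and report no bandwidth for that interval.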
-Tony