Message-ID: <ZyzxbhHQptbktfGH@agluck-desk3>
Date: Thu, 7 Nov 2024 08:57:18 -0800
From: Tony Luck <tony.luck@...el.com>
To: Peter Newman <peternewman@...gle.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>, fenghua.yu@...el.com,
	babu.moger@....com, bp@...en8.de, dave.hansen@...ux.intel.com,
	eranian@...gle.com, hpa@...or.com, james.morse@....com,
	linux-kernel@...r.kernel.org, mingo@...hat.com, nert.pinx@...il.com,
	tan.shaopeng@...itsu.com, tglx@...utronix.de, x86@...nel.org
Subject: Re: [PATCH v2 2/2] x86/resctrl: Don't workqueue local event counter
 reads

On Thu, Nov 07, 2024 at 03:26:11PM +0100, Peter Newman wrote:
> On Thu, Nov 7, 2024 at 12:01 PM Peter Newman <peternewman@...gle.com> wrote:
> >
> > Hi Reinette,
> >
> > On Thu, Nov 7, 2024 at 2:10 AM Reinette Chatre <reinette.chatre@...el.com> wrote:
> 
> > > This sounds as though user space is essentially duplicating what the
> > > MBM overflow handler currently does, which is to run a worker in each domain
> > > to collect MBM data every second from every RMID for both MBM events.
> > >
> > > * What are the requirements of this use case?
> >
> > Accurate, per-RMID MBps data, ideally at 1-second resolution if the
> > overhead can be tolerable.
> 
> Sorry, forgot about the assignable counters issue...
> 
> On AMD we'll have to cycle the available event counters through the
> groups in order to get valid bandwidth counts.

See below.

> > > For example,
> > >         # cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_00
> > >           <rdtgroup nameA> <MBM total count>
> > >           <rdtgroup nameB> <MBM total count>
> > >           ...
> > >
> > >         # cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_01
> > >           <rdtgroup nameA> <MBM total count>
> > >           <rdtgroup nameB> <MBM total count>
> > >           ...

How about:

# cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_00
<rdtgroup nameA> <MBM total count> <timestamp> <generation>
<rdtgroup nameB> <MBM total count> <timestamp> <generation>
...
# cat /sys/fs/resctrl/info/L3_MON/mbm_snapshot/mbm_total_bytes_01
<rdtgroup nameA> <MBM total count> <timestamp> <generation>
<rdtgroup nameB> <MBM total count> <timestamp> <generation>
...

Where <timestamp> tracks when this sample was captured. And
<generation> is an integer that is incremented when data
for this event is lost (e.g. due to ABMC counter re-assignment).

Then a monitoring application can compute bandwidth by sampling
these files periodically and, for each group:

	if (thisgeneration == lastgeneration) {
		bw = (thiscount - lastcount) / (thistimestamp - lasttimestamp);
	}

-Tony
