Message-ID: <Z7dqXlOMsw7Kb8F2@e133380.arm.com>
Date: Thu, 20 Feb 2025 17:46:06 +0000
From: Dave Martin <Dave.Martin@....com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: Peter Newman <peternewman@...gle.com>, "Moger, Babu" <bmoger@....com>,
	Babu Moger <babu.moger@....com>, corbet@....net, tglx@...utronix.de,
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
	tony.luck@...el.com, x86@...nel.org, hpa@...or.com,
	paulmck@...nel.org, akpm@...ux-foundation.org, thuth@...hat.com,
	rostedt@...dmis.org, xiongwei.song@...driver.com,
	pawan.kumar.gupta@...ux.intel.com, daniel.sneddon@...ux.intel.com,
	jpoimboe@...nel.org, perry.yuan@....com, sandipan.das@....com,
	kai.huang@...el.com, xiaoyao.li@...el.com, seanjc@...gle.com,
	xin3.li@...el.com, andrew.cooper3@...rix.com, ebiggers@...gle.com,
	mario.limonciello@....com, james.morse@....com,
	tan.shaopeng@...itsu.com, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
	eranian@...gle.com
Subject: Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth
 Monitoring Counters (ABMC)

Hi again,

On Thu, Feb 20, 2025 at 04:46:40PM +0000, Dave Martin wrote:
> Hi,
> 
> On Wed, Feb 19, 2025 at 09:56:29AM -0800, Reinette Chatre wrote:
> > Hi Peter,
> > 
> > On 2/19/25 3:28 AM, Peter Newman wrote:
> 
> [...]
> 
> > > In the letters as events model, choosing the events assigned to a
> > > group wouldn't be enough information, since we would want to control
> > > which events should share a counter and which should be counted by
> > > separate counters. I think the amount of information that would need
> > > to be encoded into mbm_assign_control to represent the level of
> > > configurability supported by hardware would quickly get out of hand.
> > > 
> > > Maybe as an example, one counter for all reads, one counter for all
> > > writes in ABMC would look like...
> > > 
> > > (L3_QOS_ABMC_CFG.BwType field names below)
> > > 
> > > (per domain)
> > > group 0:
> > >  counter 0: LclFill,RmtFill,LclSlowFill,RmtSlowFill
> > >  counter 1: VictimBW,LclNTWr,RmtNTWr
> > > group 1:
> > >  counter 2: LclFill,RmtFill,LclSlowFill,RmtSlowFill
> > >  counter 3: VictimBW,LclNTWr,RmtNTWr
> > > ...
> > > 
> > 
> > I think this may also be what Dave was heading towards in [2] but in that
> > example and above the counter configuration appears to be global. You do mention
> > "configurability supported by hardware" so I wonder if per-domain counter
> > configuration is a requirement?
> > 
> > Until now I viewed counter configuration separate from counter assignment,
> > similar to how AMD's counters can be configured via mbm_total_bytes_config and
> > mbm_local_bytes_config before they are assigned. That is still per-domain
> > counter configuration though, not per-counter.
> 
> I hadn't tried to work the design through in any detail: it wasn't
> intended as a suggestion for something we should definitely do right
> now; rather, it was just an incomplete sketch of one possible future
> evolution of the interface.
> 
> Either way these feel like future concerns, if the first iteration of
> ABMC is just to provide the basics so that ABMC hardware can implement
> resctrl without userspace seeing counters randomly stopping and
> resetting...
> 
> Peter, can you give a view on whether the ABMC as proposed in this series
> is a useful stepping-stone?  Or are there things that you need that you
> feel could not be added as a later extension without ABI breakage?
> 
> [...]
> 
> > > I believe that shared assignments will take care of all the
> > > high-frequency and performance-intensive batch configuration updates I
> > > was originally concerned about, so I no longer see much benefit in
> > > finding ways to textually encode all this information in a single file
> > > > when it would be more manageable to distribute it around the
> > > filesystem hierarchy.
> > 
> > This is significant. The motivation for the single file was to support
> > the "high-frequency and performance-intensive" usage. Would "shared assignments"
> > not also depend on the same files that, if distributed, will require many
> > filesystem operations? 
> > Having the files distributed will be significantly simpler while also
> > avoiding the file size issue that Dave Martin exposed. 
> > 
> > Reinette
> 
> I still haven't fully understood the "shared assignments" proposal;
> I need to go back and look at it.

Having taken a quick look at that now, this all seems to duplicate
perf's design journey (again).

"rate" events make some sense.  The perf equivalent is to keep an
accumulated count of the amount of time a counter has been assigned to
an event, and another accumulated count of the events counted by the
counter during assignment.  Only userspace knows what it wants to do
with this information: perf exposes the raw accumulated counts.
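
As a rough illustration of what userspace might do with two such
accumulated counts, here is a minimal sketch. It is not code from perf
or from this series: the function name, the (events, assigned_ns)
snapshot layout and the nanosecond unit are all assumptions made for
the example.

```python
from typing import Optional, Tuple

# Hypothetical snapshot layout: (events, assigned_ns)
#   events      - events accumulated while a hardware counter was assigned
#   assigned_ns - accumulated time (ns) a hardware counter was assigned
Snapshot = Tuple[int, int]

NSEC_PER_SEC = 1_000_000_000


def rate_per_second(before: Snapshot, after: Snapshot) -> Optional[float]:
    """Estimate events per second between two snapshots.

    Only the time during which a counter was actually assigned is used
    as the denominator, so gaps where no counter was assigned do not
    dilute the estimate.
    """
    d_events = after[0] - before[0]
    d_ns = after[1] - before[1]
    if d_ns <= 0:
        # No counter was assigned during the interval: no estimate.
        return None
    return d_events * NSEC_PER_SEC / d_ns


# Example: 500 events counted over 250 ms of assigned time.
print(rate_per_second((0, 0), (500, 250_000_000)))
```

The kernel side would only need to export the two raw totals; whether
to turn them into a rate, and over what interval, stays a userspace
decision.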

Perf events can also be pinned so that they are prioritised for
assignment to counters; that sounds a lot like the regular, non-shared
resctrl counters.


Playing devil's advocate:

It does feel like we are doomed to reinvent perf if we go too far down
this road...

> If we split the file, it will be more closely aligned with the design
> of the rest of the resctrlfs interface.
> 
> OTOH, the current interface seems workable and I think the file size
> issue can be addressed without major re-engineering.
> 
> So, from my side, I would not consider the current interface design
> a blocker.

...so, drawing a hard line around the use cases that we intend to
address with this interface and avoiding feature creep seems desirable.

resctrlfs is already in the wild, so providing reasonable baseline
compatibility with that interface for ABMC hardware is a sensible goal.
The current series does that.

But I wonder how much additional functionality we should really be
adding via the mbm_assign_control interface, once this series is
settled.

Cheers
---Dave
