Message-ID: <CALPaoCiYFKeASPMDwzzaHLw4JiMtBB6DTyVPgt0Voe3c3Tav_A@mail.gmail.com>
Date: Thu, 2 May 2024 17:57:58 -0700
From: Peter Newman <peternewman@...gle.com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: babu.moger@....com, corbet@....net, fenghua.yu@...el.com,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
paulmck@...nel.org, rdunlap@...radead.org, tj@...nel.org,
peterz@...radead.org, yanjiewtw@...il.com, kim.phillips@....com,
lukas.bulwahn@...il.com, seanjc@...gle.com, jmattson@...gle.com,
leitao@...ian.org, jpoimboe@...nel.org, rick.p.edgecombe@...el.com,
kirill.shutemov@...ux.intel.com, jithu.joseph@...el.com, kai.huang@...el.com,
kan.liang@...ux.intel.com, daniel.sneddon@...ux.intel.com,
pbonzini@...hat.com, sandipan.das@....com, ilpo.jarvinen@...ux.intel.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [RFC PATCH v3 00/17] x86/resctrl : Support AMD Assignable
Bandwidth Monitoring Counters (ABMC)
Hi Reinette,
On Thu, May 2, 2024 at 4:21 PM Reinette Chatre
<reinette.chatre@...el.com> wrote:
>
> Hi Peter and Babu,
>
> On 5/2/2024 1:14 PM, Moger, Babu wrote:
> > Are you suggesting to enable ABMC by default when available?
>
> I do think ABMC should be enabled by default when available and it looks
> to be what this series aims to do [1]. The way I reason about this is
> that legacy user space gets more reliable monitoring behavior without
> needing to change behavior.
What I don't like is that, for a monitor assignment-aware user,
creating new monitoring groups leaves fewer monitors available for
assignment. If the user wants precise control over where monitors are
allocated, they would need to manually unassign the
automatically-assigned monitor after creating new groups.
It's an annoyance, but I'm not sure it would break any realistic
usage model. If the monitoring agent operates independently of
whoever creates monitoring groups, there could be brief periods
where fewer monitors than expected are available, because whoever just
created a new monitoring group hasn't given the automatically-assigned
monitors back yet.
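
To make the concern concrete, here is a minimal toy model in Python.
It is only a sketch under stated assumptions: the names CounterPool,
create_group and unassign are hypothetical and do not correspond to any
resctrl interface, the pool size of 4 is arbitrary, and the model
assumes one automatically-assigned monitor per new group.

```python
class CounterPool:
    """Toy model of a fixed pool of assignable monitors (hypothetical)."""

    def __init__(self, num_counters, auto_assign=True):
        self.free = num_counters        # monitors not assigned to any group
        self.auto_assign = auto_assign  # assign a monitor on group creation?
        self.assigned = {}              # group name -> monitors held

    def create_group(self, name):
        """Create a monitoring group, auto-assigning a monitor if enabled."""
        self.assigned[name] = 0
        if self.auto_assign and self.free > 0:
            self.free -= 1
            self.assigned[name] = 1

    def unassign(self, name):
        """Return one monitor held by the group to the free pool."""
        if self.assigned.get(name, 0) > 0:
            self.assigned[name] -= 1
            self.free += 1


pool = CounterPool(num_counters=4)

# Whoever creates groups does so without coordinating with the agent.
pool.create_group("job_a")
pool.create_group("job_b")
free_after_create = pool.free    # the agent sees a smaller pool here

# Only after the monitors are handed back is the full pool available.
pool.unassign("job_a")
pool.unassign("job_b")
free_after_unassign = pool.free
```

The window between the two create_group() calls and the matching
unassign() calls is the brief period described above, during which an
independent monitoring agent would find fewer free monitors than it
expects.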
>
> I thought there was discussion about communicating to user space
> when an attempt is made to read data from an event that does not
> have a counter assigned. Something like below but I did not notice this
> in this series.
>
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> Unassigned
>
> >
> > Then provide the mount option switch back to legacy mode?
> > I am fine with that if we all agree on that.
>
> Why is a mount option needed? I think we should avoid requiring a remount
> unless required and I would like to understand why it is required here.
>
> Peter: could you please elaborate what you mean with it makes it more
> difficult for the FS code to generically manage monitor assignment?
>
> Why would user space be required to recreate all control and monitor
> groups if wanting to change how memory bandwidth monitoring is done?
I was looking at this more from the perspective of whether it's
necessary to support the live transition of the groups' configuration
back and forth between programming models. I find it very unlikely
for the userspace controller software to change its mind about the
programming model for monitoring in a running system, so I thought
this would be in the same category as choosing at mount time whether
or not to use CDP or the MBA software controller.
Also, in the software implementation of monitor assignment for older
AMD processors, which is based on allocating a subset of RMIDs, I'm
concerned that the context switch handler would want to read the
monitors associated with the incoming thread's current group to
determine whether it should use one of the tracked RMIDs. I believe it
would be cleaner if the generic monitor-tracking structures stayed
alive until the static branches gating __resctrl_sched_in() could be
disabled.
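
As a rough illustration of that lookup, here is a toy sketch in Python.
It is not the prototype mentioned below; Group, tracked_rmid,
DEFAULT_RMID and rmid_for_sched_in are hypothetical names, and the idea
of falling back to a single shared RMID for groups without a monitor is
an assumption made only for this example.

```python
DEFAULT_RMID = 0  # hypothetical shared RMID for groups with no monitor


class Group:
    """Toy stand-in for a monitoring group's monitor-tracking state."""

    def __init__(self, name, tracked_rmid=None):
        self.name = name
        self.tracked_rmid = tracked_rmid  # None means no monitor assigned


def rmid_for_sched_in(group):
    """Pick the RMID to use when a thread of `group` is switched in.

    The handler has to read the group's monitor-tracking state on every
    switch, so that state must remain valid for as long as the handler
    can still run.
    """
    if group.tracked_rmid is not None:
        return group.tracked_rmid
    return DEFAULT_RMID


tracked = Group("job_a", tracked_rmid=5)
untracked = Group("job_b")
rmid_tracked = rmid_for_sched_in(tracked)
rmid_untracked = rmid_for_sched_in(untracked)
```

Because the lookup dereferences per-group state on the scheduling path,
tearing that state down while the handler is still enabled would leave
it reading stale data, which is why tying its lifetime to the static
branches looks simpler.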
>
> From this implementation it has been difficult to understand the impact
> of switching between ABMC and legacy.
I'll see if there's a good way to share my software monitor assignment
prototype so it's clearer how the user interface would interact with
diverse implementations. Unfortunately, it's difficult to see the
required abstraction boundaries without the fs/resctrl refactoring
changes[1] applied. It would also require my changes[2] for reading a
thread's RMID from the FS structures to prevent monitor assignments
from forcing an update of all task_structs in the system.
-Peter
[1] https://lore.kernel.org/lkml/20240426150537.8094-1-Dave.Martin@arm.com/
[2] https://lore.kernel.org/lkml/20240325172707.73966-1-peternewman@google.com/