Message-ID: <aYyxAPdTFejzsE42@e134344.arm.com>
Date: Wed, 11 Feb 2026 16:40:32 +0000
From: Ben Horgan <ben.horgan@....com>
To: Reinette Chatre <reinette.chatre@...el.com>
Cc: "Moger, Babu" <bmoger@....com>, "Moger, Babu" <Babu.Moger@....com>,
"Luck, Tony" <tony.luck@...el.com>,
Drew Fustini <fustini@...nel.org>,
"corbet@....net" <corbet@....net>,
"Dave.Martin@....com" <Dave.Martin@....com>,
"james.morse@....com" <james.morse@....com>,
"tglx@...nel.org" <tglx@...nel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"bp@...en8.de" <bp@...en8.de>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"juri.lelli@...hat.com" <juri.lelli@...hat.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"dietmar.eggemann@....com" <dietmar.eggemann@....com>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"bsegall@...gle.com" <bsegall@...gle.com>,
"mgorman@...e.de" <mgorman@...e.de>,
"vschneid@...hat.com" <vschneid@...hat.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"pawan.kumar.gupta@...ux.intel.com" <pawan.kumar.gupta@...ux.intel.com>,
"pmladek@...e.com" <pmladek@...e.com>,
"feng.tang@...ux.alibaba.com" <feng.tang@...ux.alibaba.com>,
"kees@...nel.org" <kees@...nel.org>,
"arnd@...db.de" <arnd@...db.de>,
"fvdl@...gle.com" <fvdl@...gle.com>,
"lirongqing@...du.com" <lirongqing@...du.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"seanjc@...gle.com" <seanjc@...gle.com>,
"xin@...or.com" <xin@...or.com>,
"Shukla, Manali" <Manali.Shukla@....com>,
"dapeng1.mi@...ux.intel.com" <dapeng1.mi@...ux.intel.com>,
"chang.seok.bae@...el.com" <chang.seok.bae@...el.com>,
"Limonciello, Mario" <Mario.Limonciello@....com>,
"naveen@...nel.org" <naveen@...nel.org>,
"elena.reshetova@...el.com" <elena.reshetova@...el.com>,
"Lendacky, Thomas" <Thomas.Lendacky@....com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"peternewman@...gle.com" <peternewman@...gle.com>,
"eranian@...gle.com" <eranian@...gle.com>,
"Shenoy, Gautham Ranjal" <gautham.shenoy@....com>
Subject: Re: [RFC PATCH 13/19] x86/resctrl: Add PLZA state tracking and
context switch handling
Hi,
Thanks for including me.
On Tue, Feb 10, 2026 at 10:04:48AM -0800, Reinette Chatre wrote:
> +Ben and Drew
>
> On 2/10/26 8:17 AM, Reinette Chatre wrote:
> > Hi Babu,
> >
> > On 1/28/26 9:44 AM, Moger, Babu wrote:
> >>
> >>
> >> On 1/28/2026 11:41 AM, Moger, Babu wrote:
> >>>> On Wed, Jan 28, 2026 at 10:01:39AM -0600, Moger, Babu wrote:
> >>>>> On 1/27/2026 4:30 PM, Luck, Tony wrote:
> >>>> Babu,
> >>>>
> >>>> I've read a bit more of the code now and I think I understand more.
> >>>>
> >>>> Some useful additions to your explanation.
> >>>>
> >>>> 1) Only one CTRL group can be marked as PLZA
> >>>
> >>> Yes. Correct.
> >
> > Why limit it to one CTRL_MON group and why not support it for MON groups?
> >
> > Limiting it to a single CTRL group seems restrictive in a few ways:
> > 1) It requires that the "PLZA" group has a dedicated CLOSID. This reduces the
> > number of use cases that can be supported. Consider, for example, an existing
> > "high priority" resource group and a "low priority" resource group. The user may
> > just want to let the tasks in the "low priority" resource group run as "high priority"
> > when in CPL0. This of course may depend on what resources are allocated, for example
> > cache may need more care, but if, for example, user is only interested in memory
> > bandwidth allocation this seems a reasonable use case?
> > 2) Similar to what Tony [1] mentioned this does not enable what the hardware is
> > capable of in terms of number of different control groups/CLOSID that can be
> > assigned to MSR_IA32_PQR_PLZA_ASSOC. Why limit PLZA to one CLOSID?
> > 3) The feature seems to support RMID in MSR_IA32_PQR_PLZA_ASSOC similar to
> > MSR_IA32_PQR_ASSOC. With this, it should be possible for user space to, for
> > example, create a resource group that contains tasks of interest and create
> > a monitor group within it that monitors all tasks' bandwidth usage when in CPL0.
> > This will give user space better insight into system behavior and from what I can
> > tell is supported by the feature but not enabled?
> >
> >>>
> >>>> 2) It can't be the root/default group
> >>>
> >>> This is something I added to keep the default group in a un-disturbed,
> >
> > Why was this needed?
> >
> >>>
> >>>> 3) It can't have sub monitor groups
> >
> > Why not?
> >
> >>>> 4) It can't be pseudo-locked
> >>>
> >>> Yes.
> >>>
> >>>>
> >>>> Would a potential use case involve putting *all* tasks into the PLZA group? That
> >>>> would avoid any additional context switch overhead as the PLZA MSR would never
> >>>> need to change.
> >>>
> >>> Yes. That can be one use case.
> >>>
> >>>>
> >>>> If that is the case, maybe for the PLZA group we should allow user to
> >>>> do:
> >>>>
> >>>> # echo '*' > tasks
> >
> > Dedicating a resource group to "PLZA" seems restrictive while also adding many
> > complications since this designation makes resource group behave differently and
> > thus the files need to get extra "treatments" to handle this "PLZA" designation.
> >
> > I am wondering if it will not be simpler to introduce just one new file, for example
> > "tasks_cpl0" in both CTRL_MON and MON groups. When user space writes a task ID to the
> > file it "enables" PLZA for this task and that group's CLOSID and RMID is the associated
> > task's "PLZA" CLOSID and RMID. This gives user space the flexibility to use the same
> > resource group to manage user space and kernel space allocations while also supporting
> > various monitoring use cases. This still supports the "dedicate a resource group to PLZA"
> > use case where user space can create a new resource group with certain allocations but the
> > "tasks" file will be empty and "tasks_cpl0" contains the tasks needing to run with
> > the resource group's allocations when in CPL0.
If there is a "tasks_cpl0" then I'd expect a "cpus_cpl0" too.
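To make the quoted "tasks_cpl0" idea concrete, here is a minimal Python model of it. This is a sketch only and not kernel code: the `Group` class, the `ids_for_task` helper, and all closid/rmid values and task IDs are made up for illustration. It assumes every task sits in exactly one "tasks" file and at most one "tasks_cpl0" file, and that a task listed in no "tasks_cpl0" file keeps its user space configuration in the kernel. A "cpus_cpl0" file could be modelled the same way.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Group:
    """Hypothetical model of one resctrl group; not a kernel structure."""
    closid: int
    rmid: int
    tasks: Set[int] = field(default_factory=set)       # models "tasks"
    tasks_cpl0: Set[int] = field(default_factory=set)  # models the proposed "tasks_cpl0"


def _find(groups: Dict[str, Group], pid: int, attr: str) -> Optional[Group]:
    """Return the first group whose file `attr` lists `pid`, if any."""
    for group in groups.values():
        if pid in getattr(group, attr):
            return group
    return None


def ids_for_task(groups: Dict[str, Group], pid: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return ((user closid, user rmid), (kernel closid, kernel rmid)) for pid."""
    user = _find(groups, pid, "tasks")
    if user is None:
        raise KeyError(f"task {pid} is not in any group's tasks file")
    # Assumption: without a tasks_cpl0 entry the kernel side follows the user side.
    kernel = _find(groups, pid, "tasks_cpl0") or user
    return (user.closid, user.rmid), (kernel.closid, kernel.rmid)


# Illustrative values only.
groups = {
    "high": Group(closid=1, rmid=1),
    "low": Group(closid=2, rmid=2, tasks={100, 101}),
}
groups["high"].tasks_cpl0.add(100)  # task 100 runs as "high" when in the kernel
```

With this setup task 100 uses the "low" group's IDs in user space and the "high" group's IDs in the kernel, while task 101 uses the "low" group's IDs in both.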
>
> It looks like MPAM has a few more capabilities here and the Arm levels are numbered differently
> with EL0 meaning user space. We should thus aim to keep things as generic as possible. For example,
> instead of CPL0 using something like "kernel" or ... ?
Yes, PLZA does open up more possibilities for MPAM usage. I've talked to James
internally and here are a few thoughts.
If the use case is just an option to run all tasks with the same closid/rmid
(partid/pmg) configuration when they are running in the kernel then I'd favour a
mount option. The resctrl filesystem interface doesn't need to change and
userspace software doesn't need to change. This could either take away a
closid/rmid from userspace and dedicate it to the kernel or perhaps have a
policy to have the default group as the kernel group. If you use the default
configuration, at least for MPAM, the kernel may not be running at the highest
priority as a minimum bandwidth can be used to give a priority boost. (Once we
have a resctrl schema for this.)
It could be useful to have something a bit more featureful though. Is there a
need for the two mappings, task->cpl0 config and task->cpl1 config, to be
independent or would a single task->(cpl0 config, cpl1 config) mapping be
sufficient? It seems awkward that it's not a single write to move a task. If a
single mapping is sufficient, then there could be a single new file,
kernel_group, per CTRL_MON group (maybe MON groups too), as suggested above, but
rather than task IDs that file would hold a path to the CTRL_MON/MON group that
provides the kernel configuration for tasks running in that group. So that this
can be transparent to existing software, an empty string can mean use the
current group's configuration when in the kernel (as well as for userspace). A
slash, /, could be used to refer to the default group. This would give
something like the below under /sys/fs/resctrl.
.
├── cpus
├── tasks
├── ctrl1
│   ├── cpus
│   ├── kernel_group -> mon_groups/mon1
│   └── tasks
├── kernel_group -> ctrl1
└── mon_groups
    └── mon1
        ├── cpus
        ├── kernel_group -> ctrl1
        └── tasks
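The lookup rules for kernel_group could be modelled roughly as below. This is a sketch under assumptions, not a proposed implementation: the `resolve_kernel_group` helper is hypothetical, "/" is used as the key for the default group, paths are taken relative to the resctrl mount point, and the lookup is a single step (the target group's own kernel_group is not followed). The "ctrl2" and "ctrl3" entries are not in the tree above and are only there to show the empty-string and slash cases.

```python
from typing import Dict


def resolve_kernel_group(group: str, kernel_group_files: Dict[str, str]) -> str:
    """Return the group whose configuration applies to `group`'s tasks in the kernel.

    `kernel_group_files` maps a group path to the contents of its
    (hypothetical) kernel_group file.
    """
    value = kernel_group_files.get(group, "").strip()
    if value == "":
        # Empty string: use the current group's configuration in the kernel too.
        return group
    if value == "/":
        # A lone slash refers to the default group.
        return "/"
    # Otherwise the file holds a path to a CTRL_MON or MON group.
    return value.strip("/")


# Contents mirroring the example tree, plus two made-up groups.
kernel_group_files = {
    "/": "ctrl1",
    "ctrl1": "mon_groups/mon1",
    "mon_groups/mon1": "ctrl1",
    "ctrl2": "",   # hypothetical: unchanged behaviour for existing software
    "ctrl3": "/",  # hypothetical: kernel work accounted to the default group
}

for name in kernel_group_files:
    print(f"{name!r} -> {resolve_kernel_group(name, kernel_group_files)!r}")
```

Keeping the lookup to a single step avoids having to deal with cycles such as ctrl1 -> mon_groups/mon1 -> ctrl1, which the example tree would otherwise contain.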
>
> I have not read anything about the RISC-V side of this yet.
>
> Reinette
>
> >
> > Reinette
> >
> > [1] https://lore.kernel.org/lkml/aXpgragcLS2L8ROe@agluck-desk3/
>
Thanks,
Ben