Message-ID: <a4def722-aa98-6bf9-6e77-65a9fd9bd8ca@arm.com>
Date: Tue, 25 Oct 2022 16:55:46 +0100
From: James Morse <james.morse@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
Peter Newman <peternewman@...gle.com>
Cc: Tony Luck <tony.luck@...el.com>,
"Yu, Fenghua" <fenghua.yu@...el.com>,
"Eranian, Stephane" <eranian@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Babu Moger <Babu.Moger@....com>,
Gaurang Upasani <gupasani@...gle.com>
Subject: Re: [RFD] resctrl: reassigning a running container's CTRL_MON group
Hi Reinette, Peter,
On 20/10/2022 20:08, Reinette Chatre wrote:
> On 10/20/2022 1:48 AM, Peter Newman wrote:
>> On Thu, Oct 20, 2022 at 1:54 AM Reinette Chatre
>> <reinette.chatre@...el.com> wrote:
>>> It is still not clear to me how palatable this will be on Arm systems.
>>> This solution also involves changing the CLOSID/PARTID like your original
>>> proposal and James highlighted that it would "mess up the bandwidth counters"
>>> because of the way PARTID.PMG is used for monitoring. Perhaps even a new
>>> PMG would need to be assigned during such a monitor group move. One requirement
>>> for this RFD was to keep usage counts intact and from what I understand
>>> this will not be possible on Arm systems. There could be software mechanisms
>>> to help reduce the noise during the transition. For example, some new limbo
>>> mechanism that avoids re-assigning the old PARTID.PMG, while perhaps still
>>> using the old PARTID.PMG to read usage counts for a while? Or would the
>>> guidance just be that the counters will have some noise after the move?
>>
>> I'm going to have to follow up on the details of this in James's thread.
>> It sounded like we probably won't be able to create enough mon_groups
>> under a single control group for the rename feature to even be useful.
>> Rather, we expect the PARTID counts to be so much larger than the PMG
>> counts that creating more mon_groups to reduce the number of control
>> groups wouldn't make sense.
>>
>> At least in our use case, we're literally creating "classes of service"
>> to prioritize memory traffic, so we want a small number of control
>> groups to represent the small number of priority levels, but enough
>> RMIDs to count every job's traffic independently. For MPAM to support
>> this MBM/MBA use case in exactly this fashion, we'd have to develop the
>> monitors-not-matching-on-PARTID use case better in the MPAM
>> architecture. But before putting much effort into that, I'd want to know
>> if there's any payoff beyond being able to use resctrl the same way on
>> both implementations.
> If the expectation is that PARTID counts are very high then how about
> a solution where multiple PARTIDs are associated with the same CTRL_MON group?
> A CTRL_MON group presents a resource allocation to user space, CLOSIDs/PARTIDs
> are not exposed. So using multiple PARTIDs for a resource group (all with the
> same allocation) seems conceptually ok to me. (Please note, I did not do an
> audit to see if there are any hidden assumptions or look into lifting required
> to support this.)
This would work when systems are built to look like RDT, but MPAM has other control types
where this would have interesting behaviours.
'CPOR' is equivalent to CBM as they are both a bitmap of portions. MPAM also has 'CMAX'
where a fraction of the cache is specified. If you create two control groups with
different PARTIDs but the same configuration, their two 50%s of the cache could become
100%. CPOR can be used like this, CMAX can't.
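To make the CPOR/CMAX difference concrete, here is a minimal sketch in Python. It is a toy model, not MPAM driver code: the function names and the 8-portion cache are assumptions made up for illustration. It only shows why two PARTIDs with the same configuration stay within the same footprint under a portion bitmap, but can add up under a per-PARTID capacity limit.

```python
def cpor_reachable(bitmaps, num_portions):
    # CPOR-style control: each PARTID gets a bitmap of cache portions.
    # PARTIDs with identical bitmaps share the same portions, so the
    # union is no larger than any single bitmap.
    union = 0
    for bitmap in bitmaps:
        union |= bitmap
    return bin(union).count("1") / num_portions


def cmax_worst_case(fractions):
    # CMAX-style control: each PARTID is capped at a fraction of the
    # cache, accounted per PARTID, so the caps add up (to at most 100%).
    return min(1.0, sum(fractions))


# Two PARTIDs standing in for one control group, same configuration each.
print(cpor_reachable([0b00001111, 0b00001111], 8))  # 0.5
print(cmax_worst_case([0.5, 0.5]))                  # 1.0
```

In this model the bitmap case still reaches only half of the cache, while the two 50% limits together allow all of it.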
> So, if a user moves a MON group to a new CTRL_MON group, if there are no
> PARTID.PMG available in the destination CTRL_MON group to support the move
> then one of the free PARTID can be used, automatically assigned with the
> allocation of the destination CTRL_MON, and a new monitor group created using
> the new PMG range brought with the new PARTID.
This would be transparent on some hardware, but not on others. It depends on which
controls are supported.
Even when the controls behave in the same way, a different PARTID with the same control
values could be regulated differently, resulting in weirdness.
> There may also be a way to guide resctrl to do something like this (use
> available PARTID) when a user creates a new MON group. This may be a way
> to address the earlier concern of how applications can decide to create
> lots of MON groups vs CTRL_MON groups.
I think we should keep this intelligence in user-space.
Exposing a way to indicate how many groups can be created 'at this level' allows
user-space to determine if it's on an RMID-rich machine or a PARTID-rich machine.
If there is a way of moving a group of tasks between control groups, then we'd also need
to expose some indication as to whether the monitors at the old location keep counting
after the move (which I think is the best way of explaining the difference to user-space).
With these, user-space can change the structure it creates to better fit the resources of
the machine.
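As a rough sketch of the kind of decision user-space could make, assuming the kernel exposed the two limits described above: the function name, its parameters and the returned labels are all hypothetical, and no such interface exists in this form today. The indication about whether monitors keep counting after a move is not modelled here.

```python
def pick_layout(max_ctrl_groups, max_mon_groups_per_ctrl,
                priority_levels, jobs_per_level):
    # Hypothetical helper: the two max_* values stand in for whatever
    # "how many groups can be created at this level" interface is exposed.
    if max_ctrl_groups < priority_levels:
        return "insufficient"
    # RMID-rich machine: one control group per priority level,
    # one monitor group per job underneath it.
    if max_mon_groups_per_ctrl >= jobs_per_level:
        return "mon_group_per_job"
    # PARTID-rich machine: one control group per job, with the same
    # allocation repeated for every job at the same priority level.
    if max_ctrl_groups >= priority_levels * jobs_per_level:
        return "ctrl_group_per_job"
    return "insufficient"


print(pick_layout(16, 200, 4, 50))   # mon_group_per_job
print(pick_layout(256, 4, 4, 50))    # ctrl_group_per_job
```

The second layout repeats one allocation across several control groups, so it is only equivalent to the first on hardware whose controls behave the same when duplicated.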
Thanks,
James