[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5ce67d8f-e207-4029-8fb3-0bc7deab1e9f@amd.com>
Date: Thu, 7 Dec 2023 10:12:27 -0600
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com,
James Morse <james.morse@....com>
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
rdunlap@...radead.org, tj@...nel.org, peterz@...radead.org,
seanjc@...gle.com, kim.phillips@....com, jmattson@...gle.com,
ilpo.jarvinen@...ux.intel.com, jithu.joseph@...el.com,
kan.liang@...ux.intel.com, nikunj@....com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com,
rick.p.edgecombe@...el.com, rppt@...nel.org,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com,
peternewman@...gle.com, dhagiani@....com
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature
Hi Reinette,
On 12/6/23 12:49, Reinette Chatre wrote:
> Hi Babu,
>
> On 12/6/2023 7:40 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 12/5/23 17:17, Reinette Chatre wrote:
>>> (+James)
>>>
>>> Hi Babu,
>>>
>>> On 11/30/2023 4:57 PM, Babu Moger wrote:
>>>> These series adds the support for AMD QoS RMID Pinning feature. It is also
>>>
>>> "These series" - is this series part of a bigger work?
>>
>> No.
>> There are some some plans to optimize rmid_reads. Peter is planning to
>> work on that. But both are independent of each other.
>
> I would propose that you use "This series" instead to avoid creating
> wrong impression.
Sure.
>
>>>> a. Check if ABMC support is available
>>>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>>> #cat /sys/fs/resctrl/info/L3_MON/mon_features
>>>> llc_occupancy
>>>> mbm_total_bytes
>>>> mbm_total_bytes_config
>>>> mbm_local_bytes
>>>> mbm_local_bytes_config
>>>> abmc_capable ← Linux kernel detected ABMC feature.
>>>
>>> (Please start thinking about a new name that is not the AMD feature
>>> name. This is added to resctrl filesystem that is the generic interface
>>> used to work with different architectures. This thus needs to be generalized
>>> to what user requires and how it can be accommodated by the hardware ...
>>> this is already expected to be needed by MPAM and having a AMD feature
>>> name could cause confusion.)
>>
>> Yes. Agree.
>>
>> How about "assign_capable"?
>
> Let's wait to learn more about other use case.
>
>>
>>>
>>>>
>>>> b. Mount with ABMC support
>>>> #umount /sys/fs/resctrl/
>>>> #mount -o abmc -t resctrl resctrl /sys/fs/resctrl/
>>>>
>>>
>>> hmmm ... so this requires the user to mount resctrl, determine if the
>>> feature is supported, unmount resctrl, remount resctrl with feature enabled.
>>> Could you please elaborate what prevents this feature from being enabled
>>> without needing to remount resctrl?
>>
>> Spec says
>> "Enabling ABMC: ABMC is enabled by setting L3_QOS_EXT_CFG.ABMC_En=1 (see
>> Figure 19-7). When the state of ABMC_En is changed, it must be changed to
>> the updated value on all logical processors in the QOS Domain.
>> Upon transitions of the ABMC_En the following actions take place:
>> All ABMC assignable bandwidth counters are reset to 0.
>> The L3 default mode bandwidth counters are reset to 0.
>> The L3_QOS_ABMC_CFG MSR is reset to 0."
>>
>> So, all the monitoring group counters will be reset.
>>
>> It is technically possible to enable without remount. But ABMC mode
>> requires few new files(in each group) which I added when mounted with "-o
>> abmc". Thought it is a better option.
>>
>> Otherwise we need to add these files when ABMC is supported(not when
>> enabled). Need to add another file in /sys/fs/resctrl/info/L3_MON to
>> enable the feature on the fly.
>>
>> Both are acceptable options. Any thoughts?
>
> The new resctrl files in info/ could always be present. For example,
> user space may want to know how many counters are available before
> enabling the feature.
>
> It is not yet obvious to me that this feature requires new files
> in monitor groups.
There are two MBM events(total and local) in each group.
We should provide an interface to assign each event independently.
User can assign only one event in a group. We should also provide an
option assign both the events in the group. This needs to be done at
resctrl group level.
>
>>>> c. Read the monitor states. There will be new file "monitor_state"
>>>> for each monitor group when ABMC feature is enabled. By default,
>>>> both total and local MBM events are in "unassign" state.
>>>>
>>>> #cat /sys/fs/resctrl/monitor_state
>>>> total=unassign;local=unassign
>>>>
>>>> d. Read the event mbm_total_bytes and mbm_local_bytes. Note that MBA
>>>> events are not available until the user assigns the events explicitly.
>>>> Users need to assign the counters to monitor the events in this mode.
>>>>
>>>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>>> Unavailable
>>>
>>> How is the llc_occupancy event impacted when ABMC is enabled? Can all RMIDs
>>> still be used to track cache occupancy?
>>
>> llc_occupancy event is not impacted by ABMC mode. It can be still used to
>> track cache occupancy.
>>
>>>
>>>>
>>>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>> Unavailable
>>>
>>> I believe that "Unavailable" already has an accepted meaning within current
>>> interface and is associated with temporary failure. Even the AMD spec states "This
>>> is generally a temporary condition and subsequent reads may succeed". In the
>>> scenario above there is no chance that this counter would produce a value later.
>>> I do not think it is ideal to overload existing interface with different meanings
>>> associated with a new hardware specific feature ... something like "Disabled" seems
>>> more appropriate.
>>
>> Hardware still reports it as unavailable. Also, there are some error cases
>> hardware can report unavailable. We may not be able to differentiate that.
>
> This highlights that this resctrl feature is currently latched to AMD's
> ABMC. I do not think we should require that this resctrl feature is backed
> by hardware that can support reads of counters that are disabled. A counter
> read really only needs to be sent to hardware if it is enabled.
>
>>> Considering this we may even consider using these files themselves as a
>>> way to enable the counters if they are disabled. For example, just
>>> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
>>
>> I am not sure about this. This is specific to domain 0. This group can
>> have cpus from multiple domains. I think we should have the interface for
>> all the domains(not for specific domain).
>
> Are the ABMC registers not per CPU? This is unclear to me at this time
> since changelog of patch #13 states it is per-CPU but yet the code
> uses smp_call_function_any().
Here are the clarifications from hardware engineer about this.
# While configuring the counter, should we have to write (L3_QOS_ABMC_CFG)
on all the logical processors in a domain?
No. In order to configure a specific counter, you only need to write it
on a single logical processor in a domain. Configuring the actual ABMC
counter is a side-effect of the write to this register. And the actual
ABMC counter configuration is a global state.
"Each logical processor implements a separate copy of these registers"
identifies that if you write a 5 to L3_QOS_ABMC_CFG on C0T0, you will not
read a 5 from the L3_QOS_ABMC_CFG register on C1T0.
>
> Even so, this needs to take other use cases into account. So far Peter
> mentioned the scenario where enabling of one counter would do so for all
> events associated with that counter and then there could also be a global
> enable/disable.
>
>>
>>> to enable this counter. No need for a new "monitor_state". Please note that this
>>> is not an official proposal since there are two other use cases that still need to
>>> be considered as we await James's feedback on how this may work for MPAM and
>>> also how this may be useful on AMD hardware that does not support ABMC but
>>> users may want to get similar benefits ([1])
>>
>> Ok. Lets wait for James comments.
>
> Reinette
>
--
Thanks
Babu Moger
Powered by blists - more mailing lists