[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e36699cf-c73e-401b-b770-63eba708df38@intel.com>
Date: Wed, 6 Dec 2023 10:49:13 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>, James Morse <james.morse@....com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
<seanjc@...gle.com>, <kim.phillips@....com>, <jmattson@...gle.com>,
<ilpo.jarvinen@...ux.intel.com>, <jithu.joseph@...el.com>,
<kan.liang@...ux.intel.com>, <nikunj@....com>,
<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
<rick.p.edgecombe@...el.com>, <rppt@...nel.org>,
<maciej.wieczor-retman@...el.com>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <eranian@...gle.com>,
<peternewman@...gle.com>, <dhagiani@....com>
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature
Hi Babu,
On 12/6/2023 7:40 AM, Moger, Babu wrote:
> Hi Reinette,
>
> On 12/5/23 17:17, Reinette Chatre wrote:
>> (+James)
>>
>> Hi Babu,
>>
>> On 11/30/2023 4:57 PM, Babu Moger wrote:
>>> These series adds the support for AMD QoS RMID Pinning feature. It is also
>>
>> "These series" - is this series part of a bigger work?
>
> No.
> There are some some plans to optimize rmid_reads. Peter is planning to
> work on that. But both are independent of each other.
I would propose that you use "This series" instead to avoid creating
wrong impression.
>>> a. Check if ABMC support is available
>>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>> #cat /sys/fs/resctrl/info/L3_MON/mon_features
>>> llc_occupancy
>>> mbm_total_bytes
>>> mbm_total_bytes_config
>>> mbm_local_bytes
>>> mbm_local_bytes_config
>>> abmc_capable ← Linux kernel detected ABMC feature.
>>
>> (Please start thinking about a new name that is not the AMD feature
>> name. This is added to resctrl filesystem that is the generic interface
>> used to work with different architectures. This thus needs to be generalized
>> to what user requires and how it can be accommodated by the hardware ...
>> this is already expected to be needed by MPAM and having a AMD feature
>> name could cause confusion.)
>
> Yes. Agree.
>
> How about "assign_capable"?
Let's wait to learn more about other use case.
>
>>
>>>
>>> b. Mount with ABMC support
>>> #umount /sys/fs/resctrl/
>>> #mount -o abmc -t resctrl resctrl /sys/fs/resctrl/
>>>
>>
>> hmmm ... so this requires the user to mount resctrl, determine if the
>> feature is supported, unmount resctrl, remount resctrl with feature enabled.
>> Could you please elaborate what prevents this feature from being enabled
>> without needing to remount resctrl?
>
> Spec says
> "Enabling ABMC: ABMC is enabled by setting L3_QOS_EXT_CFG.ABMC_En=1 (see
> Figure 19-7). When the state of ABMC_En is changed, it must be changed to
> the updated value on all logical processors in the QOS Domain.
> Upon transitions of the ABMC_En the following actions take place:
> All ABMC assignable bandwidth counters are reset to 0.
> The L3 default mode bandwidth counters are reset to 0.
> The L3_QOS_ABMC_CFG MSR is reset to 0."
>
> So, all the monitoring group counters will be reset.
>
> It is technically possible to enable without remount. But ABMC mode
> requires few new files(in each group) which I added when mounted with "-o
> abmc". Thought it is a better option.
>
> Otherwise we need to add these files when ABMC is supported(not when
> enabled). Need to add another file in /sys/fs/resctrl/info/L3_MON to
> enable the feature on the fly.
>
> Both are acceptable options. Any thoughts?
The new resctrl files in info/ could always be present. For example,
user space may want to know how many counters are available before
enabling the feature.
It is not yet obvious to me that this feature requires new files
in monitor groups.
>>> c. Read the monitor states. There will be new file "monitor_state"
>>> for each monitor group when ABMC feature is enabled. By default,
>>> both total and local MBM events are in "unassign" state.
>>>
>>> #cat /sys/fs/resctrl/monitor_state
>>> total=unassign;local=unassign
>>>
>>> d. Read the event mbm_total_bytes and mbm_local_bytes. Note that MBA
>>> events are not available until the user assigns the events explicitly.
>>> Users need to assign the counters to monitor the events in this mode.
>>>
>>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>> Unavailable
>>
>> How is the llc_occupancy event impacted when ABMC is enabled? Can all RMIDs
>> still be used to track cache occupancy?
>
> llc_occupancy event is not impacted by ABMC mode. It can be still used to
> track cache occupancy.
>
>>
>>>
>>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>> Unavailable
>>
>> I believe that "Unavailable" already has an accepted meaning within current
>> interface and is associated with temporary failure. Even the AMD spec states "This
>> is generally a temporary condition and subsequent reads may succeed". In the
>> scenario above there is no chance that this counter would produce a value later.
>> I do not think it is ideal to overload existing interface with different meanings
>> associated with a new hardware specific feature ... something like "Disabled" seems
>> more appropriate.
>
> Hardware still reports it as unavailable. Also, there are some error cases
> hardware can report unavailable. We may not be able to differentiate that.
This highlights that this resctrl feature is currently latched to AMD's
ABMC. I do not think we should require that this resctrl feature is backed
by hardware that can support reads of counters that are disabled. A counter
read really only needs to be sent to hardware if it is enabled.
>> Considering this we may even consider using these files themselves as a
>> way to enable the counters if they are disabled. For example, just
>> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
>
> I am not sure about this. This is specific to domain 0. This group can
> have cpus from multiple domains. I think we should have the interface for
> all the domains(not for specific domain).
Are the ABMC registers not per CPU? This is unclear to me at this time
since changelog of patch #13 states it is per-CPU but yet the code
uses smp_call_function_any().
Even so, this needs to take other use cases into account. So far Peter
mentioned the scenario where enabling of one counter would do so for all
events associated with that counter and then there could also be a global
enable/disable.
>
>> to enable this counter. No need for a new "monitor_state". Please note that this
>> is not an official proposal since there are two other use cases that still need to
>> be considered as we await James's feedback on how this may work for MPAM and
>> also how this may be useful on AMD hardware that does not support ABMC but
>> users may want to get similar benefits ([1])
>
> Ok. Lets wait for James comments.
Reinette
Powered by blists - more mailing lists