lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 7 Dec 2023 10:12:27 -0600
From:   "Moger, Babu" <babu.moger@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
        fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, dave.hansen@...ux.intel.com,
        James Morse <james.morse@....com>
Cc:     x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
        rdunlap@...radead.org, tj@...nel.org, peterz@...radead.org,
        seanjc@...gle.com, kim.phillips@....com, jmattson@...gle.com,
        ilpo.jarvinen@...ux.intel.com, jithu.joseph@...el.com,
        kan.liang@...ux.intel.com, nikunj@....com,
        daniel.sneddon@...ux.intel.com, pbonzini@...hat.com,
        rick.p.edgecombe@...el.com, rppt@...nel.org,
        maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, eranian@...gle.com,
        peternewman@...gle.com, dhagiani@....com
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature

Hi Reinette,

On 12/6/23 12:49, Reinette Chatre wrote:
> Hi Babu,
> 
> On 12/6/2023 7:40 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 12/5/23 17:17, Reinette Chatre wrote:
>>> (+James)
>>>
>>> Hi Babu,
>>>
>>> On 11/30/2023 4:57 PM, Babu Moger wrote:
>>>> These series adds the support for AMD QoS RMID Pinning feature. It is also
>>>
>>> "These series" - is this series part of a bigger work?
>>
>> No.
>> There are some some plans to optimize rmid_reads. Peter is planning to
>> work on that. But both are independent of each other.
> 
> I would propose that you use "This series" instead to avoid creating
> wrong impression.

Sure.

> 
>>>> a. Check if ABMC support is available
>>>> 	#mount -t resctrl resctrl /sys/fs/resctrl/
>>>> 	#cat /sys/fs/resctrl/info/L3_MON/mon_features 
>>>> 	llc_occupancy
>>>> 	mbm_total_bytes
>>>> 	mbm_total_bytes_config
>>>> 	mbm_local_bytes
>>>> 	mbm_local_bytes_config
>>>> 	abmc_capable ←  Linux kernel detected ABMC feature.
>>>
>>> (Please start thinking about a new name that is not the AMD feature
>>> name. This is added to resctrl filesystem that is the generic interface
>>> used to work with different architectures. This thus needs to be generalized
>>> to what user requires and how it can be accommodated by the hardware ...
>>> this is already expected to be needed by MPAM and having a AMD feature
>>> name could cause confusion.)
>>
>> Yes. Agree.
>>
>> How about "assign_capable"?
> 
> Let's wait to learn more about other use case.
> 
>>
>>>
>>>>
>>>> b. Mount with ABMC support
>>>> 	#umount /sys/fs/resctrl/
>>>> 	#mount  -o abmc -t resctrl resctrl /sys/fs/resctrl/
>>>> 	
>>>
>>> hmmm ... so this requires the user to mount resctrl, determine if the
>>> feature is supported, unmount resctrl, remount resctrl with feature enabled.
>>> Could you please elaborate what prevents this feature from being enabled
>>> without needing to remount resctrl?
>>
>> Spec says
>> "Enabling ABMC: ABMC is enabled by setting L3_QOS_EXT_CFG.ABMC_En=1 (see
>> Figure 19-7). When the state of ABMC_En is changed, it must be changed to
>> the updated value on all logical processors in the QOS Domain.
>> Upon transitions of the ABMC_En the following actions take place:
>> All ABMC assignable bandwidth counters are reset to 0.
>> The L3 default mode bandwidth counters are reset to 0.
>> The L3_QOS_ABMC_CFG MSR is reset to 0."
>>
>> So, all the monitoring group counters will be reset.
>>
>> It is technically possible to enable without remount. But ABMC mode
>> requires few new files(in each group) which I added when mounted with "-o
>> abmc". Thought it is a better option.
>>
>> Otherwise we need to add these files when ABMC is supported(not when
>> enabled). Need to add another file in /sys/fs/resctrl/info/L3_MON to
>> enable the feature on the fly.
>>
>> Both are acceptable options. Any thoughts?
> 
> The new resctrl files in info/ could always be present. For example,
> user space may want to know how many counters are available before
> enabling the feature.
> 
> It is not yet obvious to me that this feature requires new files
> in monitor groups.

There are two MBM events(total and local) in each group.
We should provide an interface to assign each event independently.
User can assign only one event in a group. We should also provide an
option assign both the events in the group. This needs to be done at
resctrl group level.

> 
>>>> c. Read the monitor states. There will be new file "monitor_state"
>>>>    for each monitor group when ABMC feature is enabled. By default,
>>>>    both total and local MBM events are in "unassign" state.
>>>> 	
>>>> 	#cat /sys/fs/resctrl/monitor_state 
>>>> 	total=unassign;local=unassign
>>>> 	
>>>> d. Read the event mbm_total_bytes and mbm_local_bytes. Note that MBA
>>>>    events are not available until the user assigns the events explicitly.
>>>>    Users need to assign the counters to monitor the events in this mode.
>>>> 	
>>>> 	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>>> 	Unavailable
>>>
>>> How is the llc_occupancy event impacted when ABMC is enabled? Can all RMIDs
>>> still be used to track cache occupancy?
>>
>> llc_occupancy event is not impacted by ABMC mode. It can be still used to
>> track cache occupancy.
>>
>>>
>>>> 	
>>>> 	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes 
>>>> 	Unavailable
>>>
>>> I believe that "Unavailable" already has an accepted meaning within current
>>> interface and is associated with temporary failure. Even the AMD spec states "This
>>> is generally a temporary condition and subsequent reads may succeed". In the
>>> scenario above there is no chance that this counter would produce a value later.
>>> I do not think it is ideal to overload existing interface with different meanings
>>> associated with a new hardware specific feature ... something like "Disabled" seems
>>> more appropriate.
>>
>> Hardware still reports it as unavailable. Also, there are some error cases
>> hardware can report unavailable. We may not be able to differentiate that.
> 
> This highlights that this resctrl feature is currently latched to AMD's 
> ABMC. I do not think we should require that this resctrl feature is backed
> by hardware that can support reads of counters that are disabled. A counter
> read really only needs to be sent to hardware if it is enabled.
> 
>>> Considering this we may even consider using these files themselves as a
>>> way to enable the counters if they are disabled. For example, just
>>> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
>>
>> I am not sure about this. This is specific to domain 0. This group can
>> have cpus from multiple domains. I think we should have the interface for
>> all the domains(not for specific domain).
> 
> Are the ABMC registers not per CPU? This is unclear to me at this time
> since changelog of patch #13 states it is per-CPU but yet the code
> uses smp_call_function_any().

Here are the clarifications from hardware engineer about this.

# While configuring the counter, should we have to write (L3_QOS_ABMC_CFG)
on all the logical processors in a domain?

No.  In order to configure a specific counter, you only need to write it
on a  single logical processor in a domain.  Configuring the actual ABMC
counter is a side-effect of the write to this register.  And the actual
ABMC counter configuration is a  global state.

"Each logical processor implements a separate copy of these registers"
identifies that if you write a 5 to L3_QOS_ABMC_CFG on C0T0, you will not
read a 5 from the L3_QOS_ABMC_CFG register on C1T0.


> 
> Even so, this needs to take other use cases into account. So far Peter
> mentioned the scenario where enabling of one counter would do so for all
> events associated with that counter and then there could also be a global
> enable/disable.
> 
>>
>>> to enable this counter. No need for a new "monitor_state". Please note that this
>>> is not an official proposal since there are two other use cases that still need to
>>> be considered as we await James's feedback on how this may work for MPAM and
>>> also how this may be useful on AMD hardware that does not support ABMC but
>>> users may want to get similar benefits ([1])
>>
>> Ok. Lets wait for James comments.
> 
> Reinette
> 

-- 
Thanks
Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ