Date:   Thu, 7 Dec 2023 17:34:19 -0600
From:   "Moger, Babu" <bmoger@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
        corbet@....net, fenghua.yu@...el.com, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        James Morse <james.morse@....com>
Cc:     x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
        rdunlap@...radead.org, tj@...nel.org, peterz@...radead.org,
        seanjc@...gle.com, kim.phillips@....com, jmattson@...gle.com,
        ilpo.jarvinen@...ux.intel.com, jithu.joseph@...el.com,
        kan.liang@...ux.intel.com, nikunj@....com,
        daniel.sneddon@...ux.intel.com, pbonzini@...hat.com,
        rick.p.edgecombe@...el.com, rppt@...nel.org,
        maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, eranian@...gle.com,
        peternewman@...gle.com, dhagiani@....com
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature

Hi Reinette,

On 12/7/2023 5:26 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 12/7/2023 3:07 PM, Moger, Babu wrote:
>> On 12/7/2023 1:29 PM, Reinette Chatre wrote:
>>> On 12/7/2023 8:12 AM, Moger, Babu wrote:
>>>> On 12/6/23 12:49, Reinette Chatre wrote:
>>>>> On 12/6/2023 7:40 AM, Moger, Babu wrote:
>>>>>> On 12/5/23 17:17, Reinette Chatre wrote:
>>>>>>> On 11/30/2023 4:57 PM, Babu Moger wrote:
>>>>>>>> b. Mount with ABMC support
>>>>>>>>      #umount /sys/fs/resctrl/
>>>>>>>>      #mount  -o abmc -t resctrl resctrl /sys/fs/resctrl/
>>>>>>>>      
>>>>>>> hmmm ... so this requires the user to mount resctrl, determine if the
>>>>>>> feature is supported, unmount resctrl, remount resctrl with feature enabled.
>>>>>>> Could you please elaborate what prevents this feature from being enabled
>>>>>>> without needing to remount resctrl?
>>>>>> Spec says
>>>>>> "Enabling ABMC: ABMC is enabled by setting L3_QOS_EXT_CFG.ABMC_En=1 (see
>>>>>> Figure 19-7). When the state of ABMC_En is changed, it must be changed to
>>>>>> the updated value on all logical processors in the QOS Domain.
>>>>>> Upon transitions of the ABMC_En the following actions take place:
>>>>>> All ABMC assignable bandwidth counters are reset to 0.
>>>>>> The L3 default mode bandwidth counters are reset to 0.
>>>>>> The L3_QOS_ABMC_CFG MSR is reset to 0."
>>>>>>
>>>>>> So, all the monitoring group counters will be reset.
>>>>>>
>>>>>> It is technically possible to enable without remount. But ABMC mode
>>>>>> requires a few new files (in each group), which I added when mounted with
>>>>>> "-o abmc". I thought that was the better option.
>>>>>>
>>>>>> Otherwise we need to add these files when ABMC is supported (not when
>>>>>> enabled). We would also need to add another file in
>>>>>> /sys/fs/resctrl/info/L3_MON to enable the feature on the fly.
>>>>>>
>>>>>> Both are acceptable options. Any thoughts?
>>>>> The new resctrl files in info/ could always be present. For example,
>>>>> user space may want to know how many counters are available before
>>>>> enabling the feature.
>>>>>
>>>>> It is not yet obvious to me that this feature requires new files
>>>>> in monitor groups.
>>>> There are two MBM events (total and local) in each group.
>>>> We should provide an interface to assign each event independently,
>>>> so that a user can assign only one event in a group. We should also provide
>>>> an option to assign both events in the group. This needs to be done at
>>>> the resctrl group level.
>>> Understood. I would like to start by considering how (if at all) existing
>>> files may be used, thus my example of using mbm_total_bytes, before adding
>>> more files.
>>>
>>>
>>> ...
>>>
>>>>>>>>      #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>>>>>>      Unavailable
>>>>>>> I believe that "Unavailable" already has an accepted meaning within current
>>>>>>> interface and is associated with temporary failure. Even the AMD spec states "This
>>>>>>> is generally a temporary condition and subsequent reads may succeed". In the
>>>>>>> scenario above there is no chance that this counter would produce a value later.
>>>>>>> I do not think it is ideal to overload existing interface with different meanings
>>>>>>> associated with a new hardware specific feature ... something like "Disabled" seems
>>>>>>> more appropriate.
>>>>>> Hardware still reports it as unavailable. Also, there are some error cases
>>>>>> where hardware can report unavailable. We may not be able to differentiate them.
>>>>> This highlights that this resctrl feature is currently latched to AMD's
>>>>> ABMC. I do not think we should require that this resctrl feature is backed
>>>>> by hardware that can support reads of counters that are disabled. A counter
>>>>> read really only needs to be sent to hardware if it is enabled.
>>>>>
>>>>>>> Considering this we may even consider using these files themselves as a
>>>>>>> way to enable the counters if they are disabled. For example, just
>>>>>>> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
>>>>>> I am not sure about this. This is specific to domain 0. This group can
>>>>>> have cpus from multiple domains. I think we should have the interface for
>>>>>> all the domains(not for specific domain).
>>>>> Are the ABMC registers not per CPU? This is unclear to me at this time
>>>>> since changelog of patch #13 states it is per-CPU but yet the code
>>>>> uses smp_call_function_any().
>>>> Here are the clarifications from hardware engineer about this.
>>>>
>>>> # While configuring the counter, should we have to write (L3_QOS_ABMC_CFG)
>>>> on all the logical processors in a domain?
>>>>
>>>> No.  In order to configure a specific counter, you only need to write it
>>>> on a single logical processor in a domain.  Configuring the actual ABMC
>>>> counter is a side-effect of the write to this register.  And the actual
>>>> ABMC counter configuration is a global state.
>>>>
>>>> "Each logical processor implements a separate copy of these registers"
>>>> identifies that if you write a 5 to L3_QOS_ABMC_CFG on C0T0, you will not
>>>> read a 5 from the L3_QOS_ABMC_CFG register on C1T0.
>>> Thank you for this information. Would reading L3_QOS_ABMC_DSC register on
>>> C1T0 return the configuration written to L3_QOS_ABMC_CFG on C0T0 ?
>> Yes, because the counter configuration is global. Reading L3_QOS_ABMC_DSC
>> will return the configuration of the counter specified by
>> QOS_ABMC_CFG[CtrID].
>
> To confirm, when you say "global" you mean within a domain?

Yes. That is correct.
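
To illustrate the register semantics discussed above, here is a toy Python
model. It is only a sketch and not kernel code: the class QosDomain, its
method names, the CPU labels and the counter count of 32 are all made up for
the example. As a simplification, the counter ID is passed as a separate
argument rather than encoded as a field of the register value. The model
shows that the L3_QOS_ABMC_CFG copy is private to each logical processor,
while the counter configuration is shared by every logical processor in the
domain (and only within that domain).

```python
class QosDomain:
    """Toy model of one QoS domain (hypothetical, for illustration only)."""

    def __init__(self, cpus, num_counters):
        # Each logical processor implements its own copy of the CFG register.
        self.cfg_reg = {cpu: 0 for cpu in cpus}
        # The counter configuration itself is state shared by the domain.
        self.counter_cfg = [0] * num_counters

    def write_cfg(self, cpu, ctr_id, value):
        """Write CFG on one CPU; configuring the counter is a side effect."""
        self.cfg_reg[cpu] = value
        self.counter_cfg[ctr_id] = value

    def read_cfg(self, cpu):
        """Read back this CPU's private copy of the CFG register."""
        return self.cfg_reg[cpu]

    def read_dsc(self, cpu, ctr_id):
        """Read a counter's configuration from any CPU in the domain."""
        if cpu not in self.cfg_reg:
            raise KeyError(f"{cpu} is not in this domain")
        return self.counter_cfg[ctr_id]


# Two independent domains; 32 counters per domain is an assumed value.
domain0 = QosDomain(cpus=["C0T0", "C1T0"], num_counters=32)
domain1 = QosDomain(cpus=["C8T0"], num_counters=32)

# Configure counter 3 by writing on a single logical processor.
domain0.write_cfg("C0T0", ctr_id=3, value=5)

print(domain0.read_cfg("C0T0"))     # the writer sees its own register copy
print(domain0.read_cfg("C1T0"))     # another CPU's copy is untouched
print(domain0.read_dsc("C1T0", 3))  # but the counter config is domain-wide
print(domain1.read_dsc("C8T0", 3))  # and does not leak into another domain
```

This matches the C0T0/C1T0 example from the hardware engineer: the value
written on C0T0 is not read back from the CFG register on C1T0, yet the
configuration it produced is visible from C1T0.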


>
>>> Even so, you do confirm that the counter configuration is per domain. If I
>>> understand correctly the implementation in this series assumes the counters
>>> are programmed identically on all domains, but theoretically the system can support
>>> domains with different counter configurations. For example, if a resource group
>>> is limited to CPUs in one domain it would be unnecessary to consume the other
>>> domain's counters.
>> Yes. It is programmed on all the domains. Separating the domain
>> configuration will require more changes. I am not planning to address
>> that in this series.
> That may be ok. The priority is to consider how users want to interact with this
> feature and create a suitable interface to support this. This version may not
> separate domain configuration, but we do not want to create an the interface that
> prevents such an enhancement in the future. Especially since it is already known
> that hardware supports it.

Yes. Understood.
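
One way to picture an interface that programs all domains today without
ruling out per-domain configuration later is the following Python sketch. It
is purely illustrative and does not describe the resctrl interface in this
series: CounterAssignments, its methods, the group names and the counter
counts are hypothetical. Assignment takes an optional list of domains and
defaults to all of them, and a read is only forwarded to hardware when a
counter is assigned, returning "Disabled" otherwise as suggested earlier in
the thread.

```python
class CounterAssignments:
    """Tracks which (group, event) pairs hold a counter in each domain.

    Hypothetical bookkeeping model; not the resctrl implementation.
    """

    def __init__(self, domain_ids, counters_per_domain):
        self.free = {d: counters_per_domain for d in domain_ids}
        self.assigned = {d: set() for d in domain_ids}

    def assign(self, group, event, domains=None):
        """Assign a counter in the given domains (default: every domain)."""
        targets = list(self.free) if domains is None else list(domains)
        key = (group, event)
        # Check every target first so the operation is all-or-nothing.
        for d in targets:
            if key not in self.assigned[d] and self.free[d] == 0:
                raise RuntimeError(f"no free counter in domain {d}")
        for d in targets:
            if key not in self.assigned[d]:
                self.assigned[d].add(key)
                self.free[d] -= 1

    def read(self, group, event, domain, hw_read):
        """Only touch hardware if a counter is assigned in this domain."""
        if (group, event) not in self.assigned[domain]:
            return "Disabled"
        return hw_read()


# Two domains with two counters each (assumed numbers).
table = CounterAssignments(domain_ids=[0, 1], counters_per_domain=2)

# Current behaviour in the series: program the event on all domains.
table.assign("grp1", "mbm_total_bytes")

# Possible future refinement: consume a counter in domain 0 only.
table.assign("grp2", "mbm_local_bytes", domains=[0])

print(table.free)
print(table.read("grp2", "mbm_local_bytes", 1, hw_read=lambda: 123))
print(table.read("grp2", "mbm_local_bytes", 0, hw_read=lambda: 123))
```

With the default argument, callers that do not care about domains keep
working unchanged if per-domain assignment is added later.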

Thanks

Babu
