Message-ID: <a6338e5d-ab94-8da8-3127-3ce90997bf65@amd.com>
Date:   Thu, 7 Dec 2023 17:07:15 -0600
From:   "Moger, Babu" <bmoger@....com>
To:     Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
        corbet@....net, fenghua.yu@...el.com, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        James Morse <james.morse@....com>
Cc:     x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
        rdunlap@...radead.org, tj@...nel.org, peterz@...radead.org,
        seanjc@...gle.com, kim.phillips@....com, jmattson@...gle.com,
        ilpo.jarvinen@...ux.intel.com, jithu.joseph@...el.com,
        kan.liang@...ux.intel.com, nikunj@....com,
        daniel.sneddon@...ux.intel.com, pbonzini@...hat.com,
        rick.p.edgecombe@...el.com, rppt@...nel.org,
        maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, eranian@...gle.com,
        peternewman@...gle.com, dhagiani@....com
Subject: Re: [PATCH 00/15] x86/resctrl : Support AMD QoS RMID Pinning feature

Hi Reinette,

On 12/7/2023 1:29 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 12/7/2023 8:12 AM, Moger, Babu wrote:
>> On 12/6/23 12:49, Reinette Chatre wrote:
>>> On 12/6/2023 7:40 AM, Moger, Babu wrote:
>>>> On 12/5/23 17:17, Reinette Chatre wrote:
>>>>> On 11/30/2023 4:57 PM, Babu Moger wrote:
>
>>>>>> b. Mount with ABMC support
>>>>>> 	#umount /sys/fs/resctrl/
>>>>>> 	#mount  -o abmc -t resctrl resctrl /sys/fs/resctrl/
>>>>>> 	
>>>>> hmmm ... so this requires the user to mount resctrl, determine if the
>>>>> feature is supported, unmount resctrl, remount resctrl with feature enabled.
>>>>> Could you please elaborate what prevents this feature from being enabled
>>>>> without needing to remount resctrl?
>>>> Spec says
>>>> "Enabling ABMC: ABMC is enabled by setting L3_QOS_EXT_CFG.ABMC_En=1 (see
>>>> Figure 19-7). When the state of ABMC_En is changed, it must be changed to
>>>> the updated value on all logical processors in the QOS Domain.
>>>> Upon transitions of the ABMC_En the following actions take place:
>>>> All ABMC assignable bandwidth counters are reset to 0.
>>>> The L3 default mode bandwidth counters are reset to 0.
>>>> The L3_QOS_ABMC_CFG MSR is reset to 0."
>>>>
>>>> So, all the monitoring group counters will be reset.
>>>>
>>>> It is technically possible to enable without remount. But ABMC mode
>>>> requires a few new files (in each group), which I added when mounted with
>>>> "-o abmc". I thought that was the better option.
>>>>
>>>> Otherwise we need to add these files when ABMC is supported(not when
>>>> enabled). Need to add another file in /sys/fs/resctrl/info/L3_MON to
>>>> enable the feature on the fly.
>>>>
>>>> Both are acceptable options. Any thoughts?
>>> The new resctrl files in info/ could always be present. For example,
>>> user space may want to know how many counters are available before
>>> enabling the feature.
>>>
>>> It is not yet obvious to me that this feature requires new files
>>> in monitor groups.
>> There are two MBM events (total and local) in each group.
>> We should provide an interface to assign each event independently.
>> User can assign only one event in a group. We should also provide an
>> option to assign both events in the group. This needs to be done at
>> the resctrl group level.
> Understood. I would like to start by considering how (if at all) existing
> files may be used, thus my example of using mbm_total_bytes, before adding
> more files.
>
>
> ...
>
>>>>>> 	#cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>>>> 	Unavailable
>>>>> I believe that "Unavailable" already has an accepted meaning within current
>>>>> interface and is associated with temporary failure. Even the AMD spec states "This
>>>>> is generally a temporary condition and subsequent reads may succeed". In the
>>>>> scenario above there is no chance that this counter would produce a value later.
>>>>> I do not think it is ideal to overload existing interface with different meanings
>>>>> associated with a new hardware specific feature ... something like "Disabled" seems
>>>>> more appropriate.
>>>> Hardware still reports it as unavailable. Also, there are some error cases
>>>> in which hardware can report unavailable. We may not be able to
>>>> differentiate them.
>>> This highlights that this resctrl feature is currently latched to AMD's
>>> ABMC. I do not think we should require that this resctrl feature is backed
>>> by hardware that can support reads of counters that are disabled. A counter
>>> read really only needs to be sent to hardware if it is enabled.
>>>
>>>>> Considering this we may even consider using these files themselves as a
>>>>> way to enable the counters if they are disabled. For example, just
>>>>> "echo 1 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes" can be used
>>>> I am not sure about this. This is specific to domain 0. This group can
>>>> have cpus from multiple domains. I think we should have the interface for
>>>> all the domains(not for specific domain).
>>> Are the ABMC registers not per CPU? This is unclear to me at this time
>>> since changelog of patch #13 states it is per-CPU but yet the code
>>> uses smp_call_function_any().
>> Here are the clarifications from hardware engineer about this.
>>
>> # While configuring the counter, should we have to write (L3_QOS_ABMC_CFG)
>> on all the logical processors in a domain?
>>
>> No.  In order to configure a specific counter, you only need to write it
>> on a  single logical processor in a domain.  Configuring the actual ABMC
>> counter is a side-effect of the write to this register.  And the actual
>> ABMC counter configuration is a  global state.
>>
>> "Each logical processor implements a separate copy of these registers"
>> identifies that if you write a 5 to L3_QOS_ABMC_CFG on C0T0, you will not
>> read a 5 from the L3_QOS_ABMC_CFG register on C1T0.
> Thank you for this information. Would reading L3_QOS_ABMC_DSC register on
> C1T0 return the configuration written to L3_QOS_ABMC_CFG on C0T0 ?

Yes, because the counter configuration is global. Reading
L3_QOS_ABMC_DSC will return the configuration of the counter specified by
L3_QOS_ABMC_CFG[CtrID].
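
To illustrate the semantics described above, here is a toy user-space model.
It is only a sketch, not kernel code and not the register layout from the
spec: the struct fields, the helper names and the counter count are all
hypothetical, and it assumes that a read of L3_QOS_ABMC_DSC uses the CtrID
held in the reading processor's own copy of L3_QOS_ABMC_CFG.

```c
#include <assert.h>

#define NR_CPUS     2	/* C0T0 and C1T0 from the example above */
#define NR_COUNTERS 4	/* arbitrary value for this sketch */

/* Simplified, hypothetical stand-in for the L3_QOS_ABMC_CFG contents. */
struct abmc_cfg {
	unsigned int ctr_id;
	unsigned int rmid;
	int enabled;
};

struct qos_domain {
	/* Each logical processor has its own copy of L3_QOS_ABMC_CFG. */
	struct abmc_cfg cfg_reg[NR_CPUS];
	/* The actual counter configuration is global to the domain. */
	struct abmc_cfg counters[NR_COUNTERS];
};

/* Model of a write to L3_QOS_ABMC_CFG on one logical processor. */
static void write_abmc_cfg(struct qos_domain *d, int cpu, struct abmc_cfg val)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(val.ctr_id < NR_COUNTERS);

	d->cfg_reg[cpu] = val;		/* only this CPU's register copy changes */
	d->counters[val.ctr_id] = val;	/* side effect: global counter is configured */
}

/* Model of a read of L3_QOS_ABMC_DSC on one logical processor (assumed semantics). */
static struct abmc_cfg read_abmc_dsc(const struct qos_domain *d, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	return d->counters[d->cfg_reg[cpu].ctr_id];
}

/* Configure counter 0 on CPU 0, then return the RMID that CPU 1 sees via DSC. */
static unsigned int demo_cross_cpu_read(void)
{
	struct qos_domain d = {0};
	struct abmc_cfg val = { .ctr_id = 0, .rmid = 5, .enabled = 1 };

	write_abmc_cfg(&d, 0, val);
	return read_abmc_dsc(&d, 1).rmid;
}
```

In this model the write on CPU 0 is not visible in CPU 1's copy of the CFG
register, but the DSC read on CPU 1 still returns the configuration of
counter 0, which matches the hardware engineer's answer.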

>
> Even so, you do confirm that the counter configuration is per domain. If I
> understand correctly the implementation in this series assumes the counters
> are programmed identically on all domains, but theoretically the system can support
> domains with different counter configurations. For example, if a resource group
> is limited to CPUs in one domain it would be unnecessary to consume the other
> domain's counters.
Yes, it is programmed on all the domains. Separating the per-domain
configuration will require more changes, and I am not planning to address
that in this series.
>
> This also ties into what this feature may morph into when considering the
> non-ABMC AMD hardware needing similar interface as well as MPAM. I understand
> for MPAM that resources are required for a counter but I do not know their
> scope.
>
> Reinette
