lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <05786b1c-cc4e-26ea-581d-3aad3b594f91@amd.com>
Date: Fri, 2 Aug 2024 13:49:06 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>,
 Peter Newman <peternewman@...gle.com>
Cc: babu.moger@....com, corbet@....net, fenghua.yu@...el.com,
 tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 paulmck@...nel.org, rdunlap@...radead.org, tj@...nel.org,
 peterz@...radead.org, yanjiewtw@...il.com, kim.phillips@....com,
 lukas.bulwahn@...il.com, seanjc@...gle.com, jmattson@...gle.com,
 leitao@...ian.org, jpoimboe@...nel.org, rick.p.edgecombe@...el.com,
 kirill.shutemov@...ux.intel.com, jithu.joseph@...el.com,
 kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, maciej.wieczor-retman@...el.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, eranian@...gle.com,
 james.morse@....com
Subject: Re: [PATCH v5 00/20] x86/resctrl : Support AMD Assignable Bandwidth
 Monitoring Counters (ABMC)

Hi Peter/Reinette,

On 8/2/2024 11:13 AM, Reinette Chatre wrote:
> Hi Peter,
> 
> On 8/1/24 3:45 PM, Peter Newman wrote:
>> On Thu, Aug 1, 2024 at 2:50 PM Reinette Chatre
>> <reinette.chatre@...el.com> wrote:
>>> On 7/17/24 10:19 AM, Moger, Babu wrote:
>>>> On 7/12/24 17:03, Reinette Chatre wrote:
>>>>> On 7/3/24 2:48 PM, Babu Moger wrote:
> 
>>>>>> # Examples
>>>>>>
>>>>>> a. Check if ABMC support is available
>>>>>>       #mount -t resctrl resctrl /sys/fs/resctrl/
>>>>>>
>>>>>>       #cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>>>>>>       [abmc]
>>>>>>       legacy
>>>>>>
>>>>>>       Linux kernel detected ABMC feature and it is enabled.
>>>>>
>>>>> How about renaming "abmc" to "mbm_cntrs"? This will match the 
>>>>> num_mbm_cntrs
>>>>> info file and be the final step to make this generic so that another
>>>>> architecture
>>>>> can more easily support assignining hardware counters without 
>>>>> needing to call
>>>>> the feature AMD's "abmc".
>>>>
>>>> I think we aleady settled this with "mbm_cntr_assignable".
>>>>
>>>> For soft-RMID" it will be mbm_sw_assignable.
>>>
>>> Maybe getting a bit long but how about "mbm_cntr_sw_assignable" to match
>>> with the term "mbm_cntr" in accompanying "num_mbm_cntrs"?
>>
>> My users are pushing for a consistent interface regardless of whether
>> counter assignment is implemented in hardware or software, so I would
>> like to avoid exposing implementation differences in the interface
>> where possible.
> 
> This seems a reasonable ask but can we be confident that if hardware
> supports assignable counters then there will never be a reason to use
> software assignable counters? (This needs to also consider how/if Arm
> may use this feature.)
> 
> I am of course not familiar with details of the software implementation
> - could there be benefits to using it even if hardware counters are
> supported?
> 
> What I would like to avoid is future complexity of needing a new 
> mount/config
> option that user space needs to use to select if a single 
> "mbm_cntr_assignable"
> is backed by hardware or software.
> 
>> The main semantic difference with SW assignments is that it is not
>> possible to assign counters to individual events. Because the
>> implementation is assigning RMIDs to groups, assignment results in all
>> events being counted.
>>
>> I was considering introducing a boolean mbm_assign_events node to
>> indicate whether assigning individual events is supported. If true,
>> num_mbm_cntrs indicates the number of events which can be counted,
>> otherwise it indicates the number of groups to which counters can be
>> assigned and attempting to assign a single event is silently upgraded
>> to assigning counters to all events in the group.
> 
> How were you envisioning your users using the control file ("mbm_control")
> in these scenarios? Does this file's interface even work for SW assignment
> scenarios?
> 
> Users should expect consistent interface for "mbm_control" also.
> 
> It sounds to me that a potential "mbm_assign_events" will be false for SW
> assignments. That would mean that "num_mbm_cntrs" will
> contain the number of groups to which counters can be assigned?
> Would user space be required to always enable all flags (enable all 
> events) of
> all domains to the same values ... or would enabling of one flag (one 
> event)
> in one domain automatically result in all flags (all events) enabled for 
> all
> domains ... or would enabling of one flag (one event) in one domain only 
> appear
> to user space to be enabled while in reality all flags/events are 
> actually enabled?
> 
>> However, If we don't expect to see these semantics in any other
>> implementation, these semantics could be implicit in the definition of
>> a SW assignable counter.
> 
> It is not clear to me how implementation differences between hardware
> and software assignment can be hidden from user space. It is possible
> to let user space enable individual events and then silently upgrade it
> to all events. I see two options here, either "mbm_control" needs to
> explicitly show this "silent upgrade" so that user space knows which
> events are actually enabled, or "mbm_control" only shows flags/events 
> enabled
> from user space perspective. In the former scenario, this needs more
> user space support since a generic user space cannot be confident which
> flags are set after writing to "mbm_control". In the latter scenario,
> meaning of "num_mbm_cntrs" becomes unclear since user space is expected
> to rely on it to know which events can be enabled and if some are
> actually "silently enabled" when user space still thinks it needs to be
> enabled the number of available counters becomes vague.
> 
> It is not clear to me how to present hardware and software assignable
> counters with a single consistent interface. Actually, what if the
> "mbm_mode" is what distinguishes how counters are assigned instead of how
> it is backed (hw vs sw)? What if, instead of "mbm_cntr_assignable" and
> "mbm_cntr_sw_assignable" MBM modes the terms "mbm_cntr_event_assignable"
> and "mbm_cntr_group_assignable" is used? Could that replace a
> potential "mbm_assign_events" while also supporting user space in
> interactions with "mbm_control"?

If I understand correctly, current interface might work for both the sw 
and hw assignments.

In case of SW assignment, you need to manage two counters at context 
switch time. One for total event and one for local event. Basically, you 
need to calculate delta for both events. You need to do rmid read for 
both events and then calculate the delta.

If the user assigns only one event you do the calculations only for the 
event user is interested in. That will save cycles as well. In this case 
"mbm_control" will report as one one event is assigned.

In many cases user will not interested in both the events. Also events 
are configurable so users can get what they want with just one event.

Does that make sense?

-- 
- Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ