Message-ID: <96880c73-7f0b-4a62-8f9f-11042dec92c7@intel.com>
Date: Mon, 19 Aug 2024 07:52:50 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Peter Newman <peternewman@...gle.com>
CC: James Morse <james.morse@....com>, Babu Moger <babu.moger@....com>,
	<x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
	<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
	<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
	<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
	<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
	<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
	<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
	<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
	<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
	<maciej.wieczor-retman@...el.com>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <eranian@...gle.com>, <mingo@...hat.com>,
	<bp@...en8.de>, <corbet@....net>, <dave.hansen@...ux.intel.com>,
	<fenghua.yu@...el.com>, <tglx@...utronix.de>
Subject: Re: [PATCH v6 19/22] x86/resctrl: Introduce the interface to switch
 between monitor modes

Hi Peter and James,

On 8/16/24 11:09 AM, Reinette Chatre wrote:
> Hi Peter,
> 
> On 8/16/24 10:16 AM, Peter Newman wrote:
>> Hi Reinette,
>>
>> On Fri, Aug 16, 2024 at 10:01 AM Reinette Chatre
>> <reinette.chatre@...el.com> wrote:
>>>
>>> Hi James,
>>>
>>> On 8/16/24 9:31 AM, James Morse wrote:
>>>> Hi Babu,
>>>>
>>>> On 06/08/2024 23:00, Babu Moger wrote:
>>>>> Introduce interface to switch between ABMC and legacy modes.
>>>>>
>>>>> By default ABMC is enabled on boot if the feature is available.
>>>>> Provide the interface to go back to legacy mode if required.
>>>>
>>>> I may have missed it on an earlier version ... why would anyone want the non-ABMC
>>>> behaviour on hardware that requires it: counters randomly reset and randomly return
>>>> 'Unavailable'... is that actually useful?
>>>>
>>>> You default this to on, so there isn't a backward compatibility argument here.
>>>>
>>>> It seems like being able to disable this is a source of complexity - is it needed?
>>>
>>> The ability to go back to legacy was added while looking ahead to support the next
>>> "assignable counter" feature that is software based ("soft-RMID" .. "soft-ABMC"?).
>>>
>>> This series adds support for ABMC on recent AMD hardware to address the issue described
>>> in cover letter. This issue also exists on earlier AMD hardware that does not have the ABMC
>>> feature and Peter is working on a software solution to address the issue on non-ABMC hardware.
>>> This software solution is expected to have the same interface as the hardware solution but
>>> earlier discussions revealed that it may introduce extra latency that users may only want to
>>> accept during periods of active monitoring. Thus the option to disable the counter assignment
>>> mode.
>>
>> Sorry again for the soft-RMID/soft-ABMC confusion[1], it was soft-RMID
>> that impacted context switch latency. Soft-ABMC does not require any
>> additional work at context switch.
> 
> No problem. I did read [1] but I do not think I've seen soft-ABMC yet so
> my understanding of what it does is vague.
> 
>> The only disadvantage to soft-ABMC I can think of is that it also
>> limits reading llc_occupancy event counts to "assigned" groups,
>> whereas without it, llc_occupancy works reliably on all RMIDs on AMD
>> hardware.
> 
> hmmm ... keeping original llc_occupancy behavior does seem useful enough
> as motivation to keep the "legacy"/"default" mbm_assign_mode? It does sound
> to me as though soft-ABMC may not be as accurate when it comes to llc_occupancy.
> As I understand the hardware may tag entries in cache with RMID and that has a longer
> lifetime than the tasks that allocated that data into the cache. If soft-ABMC
> permanently associates an RMID with a local and total counter pair but that
> RMID is dynamically assigned to resctrl groups then a group may not always
> get the same RMID ... and thus its llc_occupancy data would be a combination of
> its cache allocations and all the cache allocations of resource groups that had
> that RMID before it. This may need significantly enhanced "limbo" handling?

To expand on this: we may have to rework the interface if the counters can be
assigned to events other than MBM.

James: could you please elaborate how you plan to use this feature and if this
interface works for the planned usage?

Peter: considering the previous example [1], where soft-ABMC was using the
"mbm_control" interface, I do not think it is ideal to only use the "t" and "l"
flags while llc_occupancy is also enabled/disabled via this interface. We should
consider (a) renaming the control file to indicate a larger scope than MBM and
(b) adding flags for llc_occupancy. What do you think? I believe this is in line
with the stated goal from [1]: "I believe mbm_control should always accurately
reflect which events are being counted."
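
As a rough illustration of option (b), here is a minimal sketch of how a flag
string could map to the events being counted. It is not the resctrl
implementation: the "o" flag for llc_occupancy is purely hypothetical, the "_"
marker for "no events" is an assumption, and the helper names are made up for
this example. Only the "t" and "l" flags come from the discussion above.

```python
# Sketch only: maps per-domain flag letters to resctrl event names.
# "t" and "l" are the flags discussed in this thread; "o" is a HYPOTHETICAL
# flag for llc_occupancy and does not exist in any posted interface.
FLAG_EVENTS = {
    "t": "mbm_total_bytes",
    "l": "mbm_local_bytes",
    "o": "llc_occupancy",  # hypothetical flag, for illustration only
}

NO_EVENTS = "_"  # assumed marker meaning "no events counted"


def parse_flags(flags):
    """Return the set of event names selected by a flag string such as "tl"."""
    events = set()
    for ch in flags:
        if ch == NO_EVENTS:
            continue
        if ch not in FLAG_EVENTS:
            raise ValueError(f"unknown flag: {ch!r}")
        events.add(FLAG_EVENTS[ch])
    return events


def format_flags(events):
    """Render a set of event names back into a flag string, in a fixed order."""
    flags = "".join(f for f, ev in FLAG_EVENTS.items() if ev in events)
    return flags or NO_EVENTS
```

With a flag per event, reading the file back would show every event that is
being counted, llc_occupancy included, rather than only the MBM ones.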

Reinette

[1] https://lore.kernel.org/lkml/CALPaoCi1CwLy_HbFNOxPfdReEJstd3c+DvOMJHb5P9jBP+iatw@mail.gmail.com/
