Message-ID: <6c5f8c64-43f6-4145-b0dc-429603f8ee24@intel.com>
Date: Fri, 22 Nov 2024 13:37:53 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <fenghua.yu@...el.com>, <x86@...nel.org>, <hpa@...or.com>,
	<thuth@...hat.com>, <paulmck@...nel.org>, <rostedt@...dmis.org>,
	<akpm@...ux-foundation.org>, <xiongwei.song@...driver.com>,
	<pawan.kumar.gupta@...ux.intel.com>, <daniel.sneddon@...ux.intel.com>,
	<perry.yuan@....com>, <sandipan.das@....com>, <kai.huang@...el.com>,
	<xiaoyao.li@...el.com>, <seanjc@...gle.com>, <jithu.joseph@...el.com>,
	<brijesh.singh@....com>, <xin3.li@...el.com>, <ebiggers@...gle.com>,
	<andrew.cooper3@...rix.com>, <mario.limonciello@....com>,
	<james.morse@....com>, <tan.shaopeng@...itsu.com>, <tony.luck@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<eranian@...gle.com>, <jpoimboe@...nel.org>, <thomas.lendacky@....com>
Subject: Re: [PATCH v9 08/26] x86/resctrl: Introduce the interface to display
 monitor mode

Hi Babu,

On 11/22/24 10:25 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 11/18/2024 4:07 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 11/18/24 11:04 AM, Moger, Babu wrote:
>>> Hi Reinette,
>>>
>>> On 11/15/24 18:00, Reinette Chatre wrote:
>>>> Hi Babu,
>>>>
>>>> On 10/29/24 4:21 PM, Babu Moger wrote:
>>>>> Introduce the interface file "mbm_assign_mode" to list monitor modes
>>>>> supported.
>>>>>
>>>>> The "mbm_cntr_assign" mode provides the option to assign a counter to
>>>>> an RMID, event pair and monitor the bandwidth as long as it is assigned.
>>>>>
>>>>> On AMD systems "mbm_cntr_assign" is backed by the ABMC (Assignable
>>>>> Bandwidth Monitoring Counters) hardware feature and is enabled by default.
>>>>>
>>>>> The "default" mode is the existing monitoring mode that works without the
>>>>> explicit counter assignment, instead relying on dynamic counter assignment
>>>>> by hardware that may result in hardware not dedicating a counter resulting
>>>>> in monitoring data reads returning "Unavailable".
>>>>>
>>>>> Provide an interface to display the monitor mode on the system.
>>>>> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>>>>> [mbm_cntr_assign]
>>>>> default
>>>>>
>>>>> Signed-off-by: Babu Moger <babu.moger@....com>
>>>>> ---
>>
>> ...
>>
>>>> I'm concerned that users with Intel platforms may want to use the "mbm_cntr_assign" mode
>>>> to make the event data "more predictable" and then be concerned when the mode does
>>>> not exist.
>>>>
>>>> As an alternative, is it possible to know the number of hardware counters on AMD systems
>>>> without ABMC? I wonder if we could perhaps always expose num_mbm_cntrs as a way for
>>>> users to know if their platform may be impacted by this type of "unpredictability" (by comparing
>>>> num_mbm_cntrs to num_rmids).
>>>
>>> There is a roundabout (or hacky) way to find out the number of RMIDs
>>> that can be active.
>>
>> Does this give consistent and accurate data? Is this something that can be added to resctrl?
>> (Reading your other message [1] it does not sound as though it can produce an accurate
>> number on boot.)
>> If not then it will be up to the documentation to be accurate.
>>
>>
>>>>> +
>>>>> +    AMD Platforms with ABMC (Assignable Bandwidth Monitoring Counters) feature
>>>>> +    enable this mode by default so that counters remain assigned even when the
>>>>> +    corresponding RMID is not in use by any processor.
>>>>> +
>>>>> +    "default":
>>>>> +
>>>>> +    In default mode resctrl assumes there is a hardware counter for each
>>>>> +    event within every CTRL_MON and MON group. Reading mbm_total_bytes or
>>>>> +    mbm_local_bytes may report 'Unavailable' if there is no counter associated
>>>>> +    with that event.
>>>>
>>>> If I understand correctly, on AMD platforms without ABMC the events only report
>>>> "Unavailable" if there is no counter assigned at the time of the query. If a counter
>>>> is unassigned and then reassigned then the event count will reset and the user
>>>> will get some data back but it may thus be unpredictable (to match earlier language).
>>>> Is this correct? Any AMD platform in "default" mode may thus be vulnerable to
>>>> "unpredictable" event counts (not just "Unavailable") ... this gets complicated
>>>
>>> Yes. All the AMD systems without ABMC are affected by this problem.
>>>
>>>> because users should be steered to avoid "default" mode if mbm_assign_mode is
>>>> available, while not be made concerned to use "default" mode on Intel where
>>>> mbm_assign_mode is not available.
>>>
>>> Can we add text to clarify this?
>>
>> Please do.
> 
> I think we need to add text about AMD systems. How about this?
> 
> "default":
> In default mode resctrl assumes there is a hardware counter for each
> event within every CTRL_MON and MON group.
> On AMD systems with 16 more monitoring groups, reading mbm_total_bytes or
> mbm_local_bytes may report 'Unavailable' if there is no counter associated
> with that event. It is therefore recommended to use the 'mbm_cntr_assign'
> mode, if supported."


What is meant by "On AMD systems with 16 more monitoring groups"? First, the language is
not clear. Second, you mentioned earlier that there is only a "hacky" way to determine the
number of RMIDs that can be active, yet here "16" is made official in the documentation?

Reinette

