lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1a93f4e3-d3d2-4764-90d1-728b9cb70481@intel.com>
Date: Tue, 26 Nov 2024 15:56:14 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <tglx@...utronix.de>,
	<mingo@...hat.com>, <bp@...en8.de>, <dave.hansen@...ux.intel.com>
CC: <fenghua.yu@...el.com>, <x86@...nel.org>, <hpa@...or.com>,
	<thuth@...hat.com>, <paulmck@...nel.org>, <rostedt@...dmis.org>,
	<akpm@...ux-foundation.org>, <xiongwei.song@...driver.com>,
	<pawan.kumar.gupta@...ux.intel.com>, <daniel.sneddon@...ux.intel.com>,
	<perry.yuan@....com>, <sandipan.das@....com>, <kai.huang@...el.com>,
	<xiaoyao.li@...el.com>, <seanjc@...gle.com>, <jithu.joseph@...el.com>,
	<brijesh.singh@....com>, <xin3.li@...el.com>, <ebiggers@...gle.com>,
	<andrew.cooper3@...rix.com>, <mario.limonciello@....com>,
	<james.morse@....com>, <tan.shaopeng@...itsu.com>, <tony.luck@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<eranian@...gle.com>, <jpoimboe@...nel.org>, <thomas.lendacky@....com>
Subject: Re: [PATCH v9 14/26] x86/resctrl: Introduce interface to display
 number of free counters

Hi Babu,

On 11/26/24 3:31 PM, Moger, Babu wrote:
> On 11/25/2024 1:00 PM, Reinette Chatre wrote:
>> On 11/22/24 3:36 PM, Moger, Babu wrote:
>>> On 11/21/2024 3:12 PM, Reinette Chatre wrote:
>>>> On 11/19/24 11:20 AM, Moger, Babu wrote:
>>>>> On 11/15/24 18:31, Reinette Chatre wrote:
>>>>>> On 10/29/24 4:21 PM, Babu Moger wrote:
>>>>>>> Provide the interface to display the number of free monitoring counters
>>>>>>> available for assignment in each doamin when mbm_cntr_assign is supported.
>>>>>>>
>>>>>>> Signed-off-by: Babu Moger <babu.moger@....com>
>>>>>>> ---
>>>>>>> v9: New patch.
>>>>>>> ---
>>>>>>>    Documentation/arch/x86/resctrl.rst     |  4 ++++
>>>>>>>    arch/x86/kernel/cpu/resctrl/monitor.c  |  1 +
>>>>>>>    arch/x86/kernel/cpu/resctrl/rdtgroup.c | 33 ++++++++++++++++++++++++++
>>>>>>>    3 files changed, 38 insertions(+)
>>>>>>>
>>>>>>> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
>>>>>>> index 2f3a86278e84..2bc58d974934 100644
>>>>>>> --- a/Documentation/arch/x86/resctrl.rst
>>>>>>> +++ b/Documentation/arch/x86/resctrl.rst
>>>>>>> @@ -302,6 +302,10 @@ with the following files:
>>>>>>>        memory bandwidth tracking to a single memory bandwidth event per
>>>>>>>        monitoring group.
>>>>>>>    +"available_mbm_cntrs":
>>>>>>> +    The number of free monitoring counters available assignment in each domain
>>>>>>
>>>>>> "The number of free monitoring counters available assignment" -> "The number of monitoring
>>>>>> counters available for assignment"?
>>>>>>
>>>>>> (not taking into account how text may change after addressing Peter's feedback)
>>>>>
>>>>> How about this?
>>>>>
>>>>> "The number of monitoring counters available for assignment in each domain
>>>>> when the architecture supports mbm_cntr_assign mode. There are a total of
>>>>> "num_mbm_cntrs" counters are available for assignment. Counters can be
>>>>> assigned or unassigned individually in each domain. A counter is available
>>>>> for new assignment if it is unassigned in all domains."
>>>>
>>>> Please consider the context of this paragraph. It follows right after the description
>>>> of "num_mbm_cntrs" that states "Up to two counters can be assigned per monitoring group".
>>>> I think it is confusing to follow that with a paragraph that states "Counters can be
>>>> assigned or unassigned individually in each domain." I wonder if it may be helpful to
>>>> use a different term ... for example a counter is *assigned* to an event of a monitoring
>>>> group but this assignment may be to specified (not yet supported) or all (this work) domains while
>>>> it is only *programmed*/*activated* to specified domains. Of course, all of this documentation
>>>> needs to remain coherent if future work decides to indeed support per-domain assignment.
>>>>
>>>
>>> Little bit lost here. Please help me.
>>
>> I think this highlights the uncertainty this interface brings. How do you expect users
>> to use this interface? At this time I think this interface can create a lot of confusion.
>> For example, consider a hypothetical system with three domains and four counters that
>> has the following state per mbm_assign_control:
>>
>> //0=tl;1=_;2=l #default group uses counters 0 and 1 to monitor total and local MBM
>> /m1/0=_;1=t;2=t #monitor group m1 uses counter 2, just for total MBM
>> /m2/0=l;1=_;2=l #monitor group m2 uses counter 3, just for local MBM
>> /m3/0=_;1=_;2=_
>>
>> Since, in this system there are only four counters available, and
>> they have all been assigned, then there are no new counters available for
>> assignment.
>>
>> If I understand correctly, available_mbm_cntrs will read:
>> 0=1;1=3;2=1
> 
> Yes. Exactly. This causes confusion to the user.
>>
>> How is a user to interpret the above numbers? It does not reflect
>> that no counter can be assigned to m3, instead it reflects which of the
>> already assigned counters still need to be activated on domains.
>> If, for example, a user is expected to use this file to know how
>> many counters can still be assigned, should it not reflect the actual
>> available counters. In the above scenario it will then be:
>> 0=0;1=0;2=0
> 
> We can also just print
> #cat available_mbm_cntrs
> 0
> 
> The domain specific information is not important here.
> That was my original idea. We can go back to that definition. That is more clear to the user.

Tony's response [1] still applies.

I believe Tony's suggestion [2] considered that the available counters will be the
same for every domain for this implementation. That is why my example noted:
"0=0;1=0;2=0"

The confusion surrounding the global allocator seems to be prevalent ([3], [4]) as folks
familiar with resctrl attempt to digest the work. The struggle to make this documentation clear
makes me more concerned how this feature will be perceived by users who are not as familiar with
resctrl internals. I think that it may be worth it to take a moment and investigate what it will take
to implement a per-domain counter allocator. The hardware supports it and I suspect that the upfront
work to do the enabling will make it easier for users to adopt and understand the feature.

What do you think?

Reinette

[1] https://lore.kernel.org/all/SJ1PR11MB6083DC9EA6D323356E957A87FC4E2@SJ1PR11MB6083.namprd11.prod.outlook.com/
[2] https://lore.kernel.org/all/SJ1PR11MB6083583A24FA3B3B7C2DCD64FC442@SJ1PR11MB6083.namprd11.prod.outlook.com/
[3] https://lore.kernel.org/all/ZwmadFbK--Qb8qWP@agluck-desk3.sc.intel.com/
[4] https://lore.kernel.org/all/CALPaoCh1BWdWww8Kztd13GBaY9mMeZX268fOQgECRytiKm-nPQ@mail.gmail.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ