lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4b66ea1c-2f76-4218-a67b-2232b2be6990@amd.com>
Date: Thu, 20 Mar 2025 19:35:56 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
 "Luck, Tony" <tony.luck@...el.com>
Cc: Peter Newman <peternewman@...gle.com>, Dave Martin <Dave.Martin@....com>,
 corbet@....net, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
 paulmck@...nel.org, akpm@...ux-foundation.org, thuth@...hat.com,
 rostedt@...dmis.org, xiongwei.song@...driver.com,
 pawan.kumar.gupta@...ux.intel.com, daniel.sneddon@...ux.intel.com,
 jpoimboe@...nel.org, perry.yuan@....com, sandipan.das@....com,
 kai.huang@...el.com, xiaoyao.li@...el.com, seanjc@...gle.com,
 xin3.li@...el.com, andrew.cooper3@...rix.com, ebiggers@...gle.com,
 mario.limonciello@....com, james.morse@....com, tan.shaopeng@...itsu.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 maciej.wieczor-retman@...el.com, eranian@...gle.com
Subject: Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth
 Monitoring Counters (ABMC)

Hi Reinette,

On 3/20/2025 5:35 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 3/20/25 11:12 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 3/19/25 13:36, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 3/14/25 9:18 AM, Moger, Babu wrote:
>>>> On 3/13/2025 4:21 PM, Reinette Chatre wrote:
>>>>> On 3/13/25 1:13 PM, Moger, Babu wrote:
>>>>>> On 3/13/25 11:08, Reinette Chatre wrote:
>>>>>>> On 3/12/25 11:14 AM, Moger, Babu wrote:
>>>>>>>> On 3/12/25 12:14, Reinette Chatre wrote:
>>>>>>>>> On 3/12/25 9:03 AM, Moger, Babu wrote:
>>>>>>>>>> On 3/12/25 10:07, Reinette Chatre wrote:
>>>>>
>>>>>
>>>>>> Here are the steps. Just copying steps from Peters proposal.
>>>>>> https://lore.kernel.org/lkml/CALPaoCiii0vXOF06mfV=kVLBzhfNo0SFqt4kQGwGSGVUqvr2Dg@mail.gmail.com/
>>>>>
>>>>> Thank you very much for detailing the steps. It is starting the fall into place
>>>>> for me.
>>>>>
>>>>>>
>>>>>>
>>>>>> 1. Mount the resctrl
>>>>>>      mount -t resctrl resctrl /sys/fs/resctrl
>>>>>
>>>>> I assume that on ABMC system the plan remains to have ABMC enabled by default, which
>>>>> will continue to depend on BMEC.
>>>>
>>>> Yes. ABMC will be enabled by default. ABMC will use the configurations from info/L3_MON/counter_configs. ABMC will not depend on BMEC.
>>>
>>> I see. The previous dependency was thus just something enforced by OS to support the
>>> chosen implementation?
>>
>> Yes. That is correct. We went that route mainly not to change the
>> rmid_read operation.
>>
>> With ABMC, we need to set Extended EVTID and ABMC bit in QM_EVTSEL
>> register while reading the cntr_id events. Will add those patches in next
>> version to make it clear.
> 
> Thank you.
> 
>>
>>> Looks like the two features share some registers.
>>>
>>>>
>>>>> How would the existing BMEC implementation be impacted in this case?
>>>>
>>>> BMEC will only work with pre-ABMC(or default) mode.
>>>
>>> ok. Does this mean that if a user boots kernel with "rdt=!bmec" then ABMC will keep working?
>>
>> Yes. That is correct.
> 
> Just to confirm and bring the two email threads together ... it sounds like the
> expectation is that existing users of BMEC are expected to use mon_features to
> know if mbm_{total,local}_bytes_config are supported. If system supports ABMC
> then BMEC will not be available and thus mon_features will not contain
> mbm_{total,local}_bytes_config. Existing users that rely on
> mbm_{total,local}_bytes_config will experience failures and are expected
> to switch to ABMC?

Yes. Exactly.

> 
> 
>>
>>>
>>>
>>>>> Without any changes to BMEC support the mbm_total_bytes_config and mbm_local_bytes_config
>>>>> files will remain and user space may continue to use them to change the event
>>>>> configurations with confusing expecations/results on an ABMC system.
>>>>>
>>>>> One possibility may be that a user may see below on ABMC system even if BMEC is supported:
>>>>> # cat /sys/fs/resctrl/info/L3_MON/mon_features
>>>>> llc_occupancy
>>>>> mbm_total_bytes
>>>>> mbm_local_bytes
>>>>>
>>>>> With the above a user cannot be expected to want to interact with mbm_total_bytes_config
>>>>> and mbm_local_bytes_config, which may be the simplest to do.
>>>>
>>>> yes.
> 
> 
> ...
> 
>>>>>
>>>>> I do think, and Peter also mentioned [1] this, that it may be useful,
>>>>> to "put a group/resource-scoped assign_* file higher in the hierarchy
>>>>> to make it easier for users who want to configure all domains the
>>>>> same for a group."
>>>>>
>>>>> Placing *additional* files higher in hierarchy (used to manage counters in all
>>>>> domains) may be more useful that trying to provide the shared/exclusive/unassign
>>>>> in one file per domain.
>>>>
>>>> Yea. To make it better we can add "mon_l3_assignments" in groups main directory. We can do all the operation in just one file.
>>>>
>>>> https://lore.kernel.org/lkml/efb5293f-b0ef-4c94-bf10-9ca7ebb3b53f@amd.com/
>>>
>>> I am hesitant to respond to that message considering the corporate preamble that
>>> sneaked in so I'll just add some thoughts here:
>>
>> Yea. I noticed it later. Will take care next time.
>>
>>>
>>> Having the file higher in hierarchy does seem more useful. It may be useful to reduce
>>> amount of parsing to get to needed information. Compare with below two examples that can
>>> be used to convey the same information:
>>>
>>> # cat /sys/fs/resctrl/test/mon_L3_assignments
>>> mbm_total_bytes: 0=unassigned; 1=unassigned
>>> mbm_local_bytes: 0=unassigned; 1=unassigned
>>>
>>> #cat /sys/fs/resctrl/test/mon_L3_assignments
>>> 0=_; 1=_
>>>
>>> We need to take care that it is always clear what "0" or "1" means ...
>>> Tony has been mentioning a lot of interesting things about scope
>>> changes. I assume the "L3" in mon_L3_assignments will dictate the scope?
>>
>> I didnt think about the scope here. I was thinking of changing it to
>> "mbm_assignments".
> 
> ah, I see, not a general "monitoring" file but specific to MBM. This still
> may encounter difficulty if AMD does something like SNC where MBM could
> be done per numa node. Perhaps we could constrain this even more with a
> "mbm_L3_assignments". If anything ever shows up that need to do MBM
> counter assignment at some other scope then at least we have the option
> to create another file "mbm_?_assignments".

Yes. Sounds good to me.

> 
>>
>>>
>>> With a syntax like above the needed information can be presented in one line.
>>> For example,
>>>
>>> #cat /sys/fs/resctrl/test/mon_L3_assignments
>>> 0=mbm_total_bytes; 1=mbm_local_bytes
>>>
>>> The caveat is that is only for assigned counters, not shared, so this needs
>>> to change.
>>>
>>> To support shared assignment ...  I wonder if it will be useful to users to
>>> get the information on which other monitor groups the counter is shared _with_?
>>>
>>> Some examples:
>>>
>>> a) Just indicate whether a counter is shared or dedicated. (Introduce flags).
>>> #cat /sys/fs/resctrl/test/mon_L3_assignments
>>> 0=mbm_total_bytes:s; 1=mbm_local_bytes:d
>>>
>>> b) Indicate which groups a counter is shared with:
>>> #cat /sys/fs/resctrl/testA/mon_L3_assignments
>>> 0=mbm_total_bytes:s(testB); 1=mbm_local_bytes:d(not needed but perhaps empty for consistent interface?)
>>> #cat /sys/fs/resctrl/testB/mon_L3_assignments
>>> 0=mbm_total_bytes:s(testA); 1=mbm_local_bytes:d(?)
>>
>> This format does not tell what is going on with all other domains. We need
>> to display all the domains. I think that is important because users need
>> to know what to expect reading the events on specific domains.
>>
>> I think we need to convey all the following information to the user.
>>
>> 1. Event Configuation: What is event configuration applied here?
>>
>> 2. Domains: To which all the domains the configaration is applied?
>>     This is useful in multi-domain configuration.
>>     We also need to know if which domains are assigned or unassigned.
>>
>> 3. Type of assignment: Exclusive(e or d) or Shared(s) or Unassigned(_)
>>
>> 4. Finally: Where to read the events?
>>   This is important when we add "mkdir" support in the future,
>> mon_data/mon_l3_*/config_name will be created.
>>
>>
>> With that in mind this might be helpful.
>>
>> # cat /sys/fs/resctrl/test/mon_L3_assignments
>> mbm_total_bytes: 0=e; 1=_
>> mbm_local_bytes: 0=_; 1=s
>>
>> This format tells the user all the information.
>> mbm_total_bytes and mbm_local_bytes configurations are applied and
>> configuration are coming from counter_configs.
>>
>> User can read the events in
>> mon_data/mon_L3_*/mbm_total_bytes
>> mon_data/mon_L3_*/mbm_local_bytes
>>
>> mbm_total_bytes is assigned on domain 0 and not on domain 1.
>> Reading the mbm_total_bytes on domain 1 will report "unassigned".
>>
>> mbm_local_bytes is not assigned on domain 0 and assigned on domain 1.
>> Reading the mbm_local_bytes on domain 0 will report "unassigned".
> 
> Thank you very much for spelling it out. Much appreciated. This looks good to me.
> Please include your list of requirements for interface in the cover-letter and/or
> patch that introduces the interface.

Sure. Will do.

> 
>>
>> I dont have much information on shared assignment at this point. Dont know
>> if we can display shared group.
> 
> The proposed interface accommodates shared counters. The expectation is that
> users can keep track themselves and if not, then the information can be
> obtained with a read of every group's counter assignment. The issue here is
> that this may worst case need a large number of file operations if expectation
> is that it will still be possible to create num RMID monitoring groups.
> 
> Using files inside monitor group for this information may actually not be ideal.
> If this information is needed then we could perhaps add a new file. For
> example:
> /sys/fs/resctrl/info/L3_MON/counter_configs/mbm_total_bytes/<file reporting which monitor groups share this counter configuration in different domains>
> Of course, I do not know if this will be required and this seems manageable as
> a later enhancement if needed.
> 

Yes. It can be done this way.

Thanks
Babu


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ