Message-ID: <3223bd31-2112-0c5e-08d4-7e4942d031ec@amd.com>
Date: Wed, 21 Aug 2024 20:31:40 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
 tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
 kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
 jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
 rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
 jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
 daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
 ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
 maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v6 00/22] x86/resctrl : Support AMD Assignable Bandwidth
 Monitoring Counters (ABMC)

Hi Reinette,

On 8/16/24 16:28, Reinette Chatre wrote:
> Hi Babu,
> 
> On 8/6/24 3:00 PM, Babu Moger wrote:
>>
>> Feature adds following interface files:
>>
>> /sys/fs/resctrl/info/L3_MON/mbm_mode: Reports the list of assignable
>> monitoring features supported. The enclosed brackets indicate which
>> feature is enabled.
> 
> I've been considering this file as a generic file where all future "MBM modes"
> can be captured, while this series treats it as specific to "assignable monitoring
> features" (btw, should this be "assignable monitoring modes" to match the name?).
> Looking closer at this implementation it does make things easier that "mbm_mode" is
> specific to "assignable monitoring features" but when doing so I think it should have
> a less generic name to avoid the obstacles we have with the existing "mon_features".
> Apologies that this goes back to be close to what you had earlier ... maybe
> "mbm_assign_mode"?

Let's see:
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
[mbm_cntr_assign]  <- This already says 'assign'. Isn't that enough?

default            <- The default mode is not related to assignable features.

I would think mbm_mode is fine. Let me know.
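
For what it is worth, user space tooling would key on the bracket convention
rather than on the file name. A minimal Python sketch of that, assuming one
mode per line with the enabled mode in brackets as in the examples in this
thread (parse_mbm_mode is a hypothetical helper, not part of the series):

```python
def parse_mbm_mode(text):
    """Return (enabled_mode, all_modes) from mbm_mode-style output.

    Assumption: one mode name per line, and the enabled mode is the
    one wrapped in square brackets.
    """
    enabled = None
    modes = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            enabled = name
        modes.append(name)
    return enabled, modes


if __name__ == "__main__":
    print(parse_mbm_mode("[mbm_cntr_assign]\nlegacy\n"))
```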

>>
>> /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: Reports the number of monitoring
>> counters available for assignment.
>>
>> /sys/fs/resctrl/info/L3_MON/mbm_control: Reports the resctrl group and monitor
>> status of each group. Assignment state can be updated by writing to the
>> interface.
>>
>> # Examples
>>
>> a. Check if ABMC support is available
>>     #mount -t resctrl resctrl /sys/fs/resctrl/
>>
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>>     [mbm_cntr_assign]
>>     legacy
>>
>>     ABMC feature is detected and it is enabled.
>>
>> b. Check how many ABMC counters are available.
>>
>>     #cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>>     32
>>
>> c. Create few resctrl groups.
>>
>>     # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
>>     # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
>>     # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
>>
>>
>> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_control
>>     to list and modify the group's monitoring states. File provides single place
>>     to list monitoring states of all the resctrl groups. It makes it easier for
>>     user space to learn about the counters are used without needing to traverse
> 
> "to learn about the counters are used" -> "to learn the counters that are used" or
> "to learn about the used counters" or ...?

Sure.
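
As a rough illustration of the single-file benefit, here is a minimal Python
sketch of how user space might parse the mbm_control listing in the format
quoted below. It is only a sketch: parse_mbm_control and count_assigned_events
are hypothetical names, and it assumes group names never contain "/".

```python
def parse_mbm_control(text):
    """Parse an mbm_control listing.

    Returns {(ctrl_mon_group, mon_group): {domain_id: set_of_flags}}.
    An empty group name stands for the default group, and the "_" flag
    is represented as an empty set.
    """
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # "<CTRL_MON group>/<MON group>/<domain_id>=<flags>;..."
        ctrl_grp, mon_grp, domains = line.split("/", 2)
        per_domain = {}
        for entry in domains.split(";"):
            if not entry:
                continue  # skip the empty piece after the trailing ';'
            domain_id, flags = entry.split("=", 1)
            per_domain[int(domain_id)] = set() if flags == "_" else set(flags)
        groups[(ctrl_grp, mon_grp)] = per_domain
    return groups


def count_assigned_events(groups):
    """Count assigned events across all groups and domains."""
    return sum(len(flags) for doms in groups.values() for flags in doms.values())


if __name__ == "__main__":
    listing = (
        "non_default_ctrl_mon_grp//0=_;1=_;\n"
        "non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;\n"
        "//0=tl;1=tl;\n"
        "/child_default_mon_grp/0=tl;1=l;\n"
    )
    print(count_assigned_events(parse_mbm_control(listing)))
```

One read of the file yields the state of every group, so no directory
traversal is needed.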

> 
>>     all the groups thus reducing the number of file system calls.
>>
>>     The list follows the following format:
>>
>>     "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>>
>>     Format for specific type of groups:
>>
>>     * Default CTRL_MON group:
>>             "//<domain_id>=<flags>"
>>
>>     * Non-default CTRL_MON group:
>>             "<CTRL_MON group>//<domain_id>=<flags>"
>>
>>     * Child MON group of default CTRL_MON group:
>>             "/<MON group>/<domain_id>=<flags>"
>>
>>     * Child MON group of non-default CTRL_MON group:
>>             "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>>
>>     Flags can be one of the following:
>>
>>          t  MBM total event is enabled.
>>          l  MBM local event is enabled.
>>          tl Both total and local MBM events are enabled.
>>          _  None of the MBM events are enabled
>>
>>     Examples:
>>
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=tl;
>>     
>>     There are four groups and all the groups have local and total
>>     event enabled on domain 0 and 1.
>>
>> e. Update the group assignment states using the interface file
>> /sys/fs/resctrl/info/L3_MON/mbm_control.
>>
>>     The write format is similar to the above list format with addition
>>     of opcode for the assignment operation.
>>          "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>>
>>     
>>     * Default CTRL_MON group:
>>             "//<domain_id><opcode><flags>"
>>     
>>     * Non-default CTRL_MON group:
>>             "<CTRL_MON group>//<domain_id><opcode><flags>"
>>     
>>     * Child MON group of default CTRL_MON group:
>>             "/<MON group>/<domain_id><opcode><flags>"
>>     
>>     * Child MON group of non-default CTRL_MON group:
>>             "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>>     
>>     Opcode can be one of the following:
>>     
>>     = Update the assignment to match the flag.
>>     + Assign a new event.
>>     - Unassign a new event.
> 
> Since user space can provide more than one flag the text could be more accurate
> noting this. Eg. "Update the assignment to match the flag" -> "Update the assignment
> to match the flags.".

Sure.
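
To spell out the intended semantics with multiple flags, the three opcodes
behave like set operations on the events assigned in a domain. A minimal
Python sketch of that reading (apply_opcode is a hypothetical name, and this
is a user-space model, not the kernel code):

```python
def apply_opcode(current, opcode, requested):
    """Return the new set of assigned events for one domain.

    current   -- set of currently assigned events, e.g. {"t", "l"}
    opcode    -- "=", "+" or "-"
    requested -- set of events named in the write; empty set models "_"
    """
    if opcode == "=":
        return set(requested)        # replace the assignment outright
    if opcode == "+":
        return current | requested   # assign the additional events
    if opcode == "-":
        return current - requested   # unassign the named events
    raise ValueError("unknown opcode: %r" % (opcode,))


if __name__ == "__main__":
    print(sorted(apply_opcode({"t", "l"}, "-", {"t"})))
```

This matches the examples in the cover letter: "1-t" on "tl" leaves "l",
"0+l" on "t" gives "tl", and "=_" clears both events.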

> 
>>
>>     Flags can be one of the following:
>>
>>          t  MBM total event.
>>          l  MBM local event.
>>          tl Both total and local MBM events.
>>          _  None of the MBM events. Only works with '=' opcode.
> 
> Please take care with the implementation that seems to support a variety of
> combinations. If I understand correctly the implementation support flags like,
> for example, "tttt", "llll", "ltlt" ... those may not be an issue but of most
> concern is, for example, a pattern like "_lt" that (unexpectedly) appears to
> result in set of total and local.

Yes. Should we not allow flag combinations with "_"?
I am not very sure about how to go about this.
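
One possible rule, sketched in Python only to make the question concrete:
accept "_" solely when it is the entire flag string and the opcode is "=",
and reject anything else that mixes it in. Whether repeated flags such as
"tttt" should also be rejected is left as a parameter here. The function
name and the strictness are assumptions for discussion, not what the series
currently does.

```python
def validate_flags(flags, opcode, allow_repeats=False):
    """Return the set of requested events, or raise ValueError.

    Assumed rules (for discussion only):
      * "_" is valid only on its own and only with the "=" opcode.
      * Every other character must be "t" or "l".
      * Repeated flags are rejected unless allow_repeats is True.
    """
    if not flags:
        raise ValueError("empty flag string")
    if flags == "_":
        if opcode != "=":
            raise ValueError("'_' only works with the '=' opcode")
        return set()
    seen = set()
    for ch in flags:
        if ch not in ("t", "l"):
            # Catches "_" mixed with other flags, e.g. "_lt".
            raise ValueError("invalid flag %r in %r" % (ch, flags))
        if ch in seen and not allow_repeats:
            raise ValueError("repeated flag %r in %r" % (ch, flags))
        seen.add(ch)
    return seen


if __name__ == "__main__":
    for candidate in ("tl", "_", "_lt", "tttt"):
        try:
            print(candidate, sorted(validate_flags(candidate, "=")))
        except ValueError as err:
            print(candidate, "rejected:", err)
```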

> 
>>     
>>     Initial group status:
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=tl;
>>
>>     To update the default group to enable only total event on domain 0:
>>     # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>>     Assignment status after the update:
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=t;1=tl;
>>     /child_default_mon_grp/0=tl;1=tl;
>>
>>     To update the MON group child_default_mon_grp to remove total event on domain 1:
>>     # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>>     Assignment status after the update:
>>     $ cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>     //0=t;1=tl;
>>     /child_default_mon_grp/0=tl;1=l;
>>
>>     To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
>>     remove both local and total events on domain 1:
>>     # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>>     Assignment status after the update:
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>>     //0=t;1=tl;
>>     /child_default_mon_grp/0=tl;1=l;
>>
>>     To update the default group to add a local event domain 0.
>>     # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>>     Assignment status after the update:
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=tl;1=tl;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=l;
>>
>>     To update the non default CTRL_MON group non_default_ctrl_mon_grp to unassign all
>>     the MBM events on all the domains.
>>     # echo "non_default_ctrl_mon_grp//*=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>>     Assignment status after the update:
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>>     non_default_ctrl_mon_grp//0=_;1=_;
>>     non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>>     //0=tl;1=tl;
>>     /child_default_mon_grp/0=tl;1=l;
>>
>>
>> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
>>     There is no change in reading the events with ABMC. If the event is unassigned
>>     when reading, then the read will come back as "Unassigned".
>>     
>>     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>     779247936
>>     # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>     765207488
>>     
>> g. Check the bandwidth configuration for the group. Note that bandwidth
>>     configuration has a domain scope. Total event defaults to 0x7F (to
>>     count all the events) and local event defaults to 0x15 (to count all
>>     the local numa events). The event bitmap decoding is available at
>>     https://www.kernel.org/doc/Documentation/x86/resctrl.rst
>>     in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>>     
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     0=0x7f;1=0x7f
>>     
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>>     0=0x15;1=0x15
>>     
>> h. Change the bandwidth source for domain 0 for the total event to count only reads.
>>     Note that this change effects total events on the domain 0.
>>     
>>     #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>>     0=0x33;1=0x7F
>>     
>> i. Now read the total event again. The first read will come back with "Unavailable"
>>     status. The subsequent read of mbm_total_bytes will display only the read events.
>>     
>>     #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>     Unavailable
>>     #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>     314101
>>
>> j. Users will have the option to go back to legacy mbm_mode if required.
>>     This can be done using the following command. Note that switching the
>>     mbm_mode will reset all the mbm counters of all resctrl groups.
> 
> "reset all the mbm counters" -> "reset all the MBM counters"

Sure.

> 
>>
>>     # echo "legacy" > /sys/fs/resctrl/info/L3_MON/mbm_mode
>>     # cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>>     mbm_cntr_assign
>>     [legacy]
>>
>>     
>> k. Unmount the resctrl
>>     
>>     #umount /sys/fs/resctrl/
>> ---
>> v6:
>>    We still need to finalize few interface details on mbm_mode and mbm_control
>>    in case of ABMC and Soft-ABMC. We can continue the discussion with this series.
> 
> Could you please list the details that need to be finalized?

1. mbm_mode display
    # cat /sys/fs/resctrl/info/L3_MON/mbm_mode
      mbm_cntr_assign
      [legacy]

     "mbm_cntr_assign"
      Are we sticking with "mbm_cntr_assign" for ABMC?
      What should we name the mode for soft-ABMC?

2. Also, we had some concerns about individual event assignment (ABMC)
    versus group assignment (soft-ABMC).
    Are the flags "t" and "l" good for both of these modes?

> 
> Thank you
> 
> Reinette
> 

-- 
Thanks
Babu Moger
