Message-ID: <d093c0bc-dfd2-422a-9d23-2bde68dc6f73@intel.com>
Date: Fri, 16 Aug 2024 14:28:58 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [PATCH v6 00/22] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)
Hi Babu,
On 8/6/24 3:00 PM, Babu Moger wrote:
>
> Feature adds following interface files:
>
> /sys/fs/resctrl/info/L3_MON/mbm_mode: Reports the list of assignable
> monitoring features supported. The enclosed brackets indicate which
> feature is enabled.
I've been considering this file as a generic file where all future "MBM modes"
can be captured, while this series treats it as specific to "assignable monitoring
features" (btw, should this be "assignable monitoring modes" to match the name?).
Looking closer at this implementation, it does make things easier that "mbm_mode" is
specific to "assignable monitoring features", but in that case I think it should have
a less generic name to avoid the obstacles we have with the existing "mon_features".
Apologies that this goes back to something close to what you had earlier ... maybe
"mbm_assign_mode"?
>
> /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: Reports the number of monitoring
> counters available for assignment.
>
> /sys/fs/resctrl/info/L3_MON/mbm_control: Reports the resctrl group and monitor
> status of each group. Assignment state can be updated by writing to the
> interface.
>
> # Examples
>
> a. Check if ABMC support is available
> #mount -t resctrl resctrl /sys/fs/resctrl/
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_mode
> [mbm_cntr_assign]
> legacy
>
> ABMC feature is detected and it is enabled.
>
> b. Check how many ABMC counters are available.
>
> #cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
> 32
>
> c. Create few resctrl groups.
>
> # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
>
>
> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_control
> to list and modify the group's monitoring states. File provides single place
> to list monitoring states of all the resctrl groups. It makes it easier for
> user space to learn about the counters are used without needing to traverse
"to learn about the counters are used" -> "to learn the counters that are used" or
"to learn about the used counters" or ...?
> all the groups thus reducing the number of file system calls.
>
> The list follows the following format:
>
> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>
> Format for specific type of groups:
>
> * Default CTRL_MON group:
> "//<domain_id>=<flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id>=<flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id>=<flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>
> Flags can be one of the following:
>
> t MBM total event is enabled.
> l MBM local event is enabled.
> tl Both total and local MBM events are enabled.
> _ None of the MBM events are enabled
>
> Examples:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> There are four groups and all the groups have local and total
> event enabled on domain 0 and 1.
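To illustrate how user space might consume this list format, here is a minimal sketch in Python. The helper name parse_mbm_control_line() is hypothetical and not part of this series, and the sketch assumes the format stays exactly as described above (group names without "/", entries terminated by ";").

```python
def parse_mbm_control_line(line):
    """Split one mbm_control line into (ctrl_group, mon_group, {domain_id: flags})."""
    # Assumption: group names cannot contain "/", so the first two "/"
    # characters always separate the CTRL_MON group, MON group and domain list.
    ctrl_group, mon_group, assignments = line.strip().split("/", 2)
    domains = {}
    for entry in assignments.split(";"):
        if not entry:
            # Skip the empty piece that follows the trailing ";".
            continue
        domain_id, flags = entry.split("=", 1)
        domains[int(domain_id)] = flags
    return ctrl_group, mon_group, domains
```

An empty string for either group name denotes the default CTRL_MON group or "no child MON group", matching the four formats listed above.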
>
> e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_control.
>
> The write format is similar to the above list format with addition
> of opcode for the assignment operation.
> "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>
>
> * Default CTRL_MON group:
> "//<domain_id><opcode><flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id><opcode><flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id><opcode><flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>
> Opcode can be one of the following:
>
> = Update the assignment to match the flag.
> + Assign a new event.
> - Unassign a new event.
Since user space can provide more than one flag, the text would be more accurate
if it noted this, e.g. "Update the assignment to match the flag" -> "Update the
assignment to match the flags.".
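To make the intended opcode semantics concrete, here is how I read them, as a small Python sketch. This is only my interpretation of the cover letter, not the kernel implementation, and apply_opcode() is a hypothetical name.

```python
def apply_opcode(current, opcode, flags):
    """Return the new set of assigned events, e.g. {"t", "l"}."""
    if "_" in flags and opcode != "=":
        # Per the cover letter, "_" only works with the "=" opcode.
        raise ValueError('"_" is only valid with "="')
    requested = set(flags) - {"_"}  # "_" stands for "no events"
    if opcode == "=":
        return requested                  # replace the assignment
    if opcode == "+":
        return set(current) | requested   # assign additional events
    if opcode == "-":
        return set(current) - requested   # unassign the listed events
    raise ValueError(f"unknown opcode: {opcode!r}")
```

With this reading, "//0=t" applied to a domain that has both events leaves only the total event assigned, which matches the example output later in the cover letter.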
>
> Flags can be one of the following:
>
> t MBM total event.
> l MBM local event.
> tl Both total and local MBM events.
> _ None of the MBM events. Only works with '=' opcode.
Please take care with the implementation, which seems to support a variety of
combinations. If I understand correctly, the implementation supports flags like,
for example, "tttt", "llll", "ltlt" ... those may not be an issue, but of most
concern is, for example, a pattern like "_lt" that (unexpectedly) appears to
result in both total and local being set.
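As a sketch of the stricter behavior I have in mind (Python for brevity, not a proposal for the actual kernel code; validate_flags() is a hypothetical name), the parser could reject repeated flags and any mix of "_" with other flags. Whether "lt" should be accepted alongside "tl" is an open choice; this sketch accepts it.

```python
def validate_flags(flags):
    """Return the set of requested events, or raise ValueError."""
    if flags == "_":
        return set()  # "_" alone means no events
    if not flags:
        raise ValueError("empty flags")
    seen = set()
    for ch in flags:
        if ch not in ("t", "l"):
            # This also rejects "_" combined with other flags, e.g. "_lt".
            raise ValueError(f"invalid flag {ch!r} in {flags!r}")
        if ch in seen:
            # Rejects repeats such as "tttt" or "ltlt".
            raise ValueError(f"duplicate flag {ch!r} in {flags!r}")
        seen.add(ch)
    return seen
```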
>
> Initial group status:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> To update the default group to enable only total event on domain 0:
> # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=t;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
> To update the MON group child_default_mon_grp to remove total event on domain 1:
> # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Assignment status after the update:
> $ cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=t;1=tl;
> /child_default_mon_grp/0=tl;1=l;
>
> To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
> remove both local and total events on domain 1:
> # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Assignment status after the update:
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> //0=t;1=tl;
> /child_default_mon_grp/0=tl;1=l;
>
> To update the default group to add a local event domain 0.
> # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=l;
>
> To update the non default CTRL_MON group non_default_ctrl_mon_grp to unassign all
> the MBM events on all the domains.
> # echo "non_default_ctrl_mon_grp//*=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> non_default_ctrl_mon_grp//0=_;1=_;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=l;
>
>
> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
> There is no change in reading the events with ABMC. If the event is unassigned
> when reading, then the read will come back as "Unassigned".
>
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 779247936
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> 765207488
>
> g. Check the bandwidth configuration for the group. Note that bandwidth
> configuration has a domain scope. Total event defaults to 0x7F (to
> count all the events) and local event defaults to 0x15 (to count all
> the local numa events). The event bitmap decoding is available at
> https://www.kernel.org/doc/Documentation/x86/resctrl.rst
> in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x7f;1=0x7f
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> 0=0x15;1=0x15
>
> h. Change the bandwidth source for domain 0 for the total event to count only reads.
> Note that this change affects total events on the domain 0.
>
> #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x33;1=0x7F
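For readers checking the values used here, the following Python sketch decodes a configuration bitmap. The bit descriptions are paraphrased from the "mbm_total_bytes_config" section of the kernel resctrl documentation as I understand it; the names MBM_CONFIG_BITS and decode_mbm_config() are hypothetical.

```python
# Bit positions paraphrased from the kernel resctrl documentation
# (section "mbm_total_bytes_config", "mbm_local_bytes_config").
MBM_CONFIG_BITS = {
    0: "reads to memory in the local NUMA domain",
    1: "reads to memory in the non-local NUMA domain",
    2: "non-temporal writes to the local NUMA domain",
    3: "non-temporal writes to the non-local NUMA domain",
    4: "reads to slow memory in the local NUMA domain",
    5: "reads to slow memory in the non-local NUMA domain",
    6: "dirty victims from the QOS domain to all types of memory",
}


def decode_mbm_config(value):
    """Return the descriptions of all event types selected by the bitmap."""
    return [name for bit, name in MBM_CONFIG_BITS.items() if value & (1 << bit)]
```

Under this table, 0x33 selects bits 0, 1, 4 and 5 (reads only), 0x15 selects bits 0, 2 and 4 (local traffic only), and 0x7f selects all seven event types.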
>
> i. Now read the total event again. The first read will come back with "Unavailable"
> status. The subsequent read of mbm_total_bytes will display only the read events.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> Unavailable
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 314101
>
> j. Users will have the option to go back to legacy mbm_mode if required.
> This can be done using the following command. Note that switching the
> mbm_mode will reset all the mbm counters of all resctrl groups.
"reset all the mbm counters" -> "reset all the MBM counters"
>
> # echo "legacy" > /sys/fs/resctrl/info/L3_MON/mbm_mode
> # cat /sys/fs/resctrl/info/L3_MON/mbm_mode
> mbm_cntr_assign
> [legacy]
>
>
> k. Unmount the resctrl
>
> #umount /sys/fs/resctrl/
> ---
> v6:
> We still need to finalize few interface details on mbm_mode and mbm_control
> in case of ABMC and Soft-ABMC. We can continue the discussion with this series.
Could you please list the details that need to be finalized?
Thank you
Reinette