[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <48b595cc-5ffe-4507-bffd-335a60fdaab9@intel.com>
Date: Fri, 3 May 2024 16:24:10 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Babu Moger <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [RFC PATCH v3 00/17] x86/resctrl : Support AMD Assignable
Bandwidth Monitoring Counters (ABMC)
Hi Babu,
On 3/28/2024 6:06 PM, Babu Moger wrote:
> a. Check if ABMC support is available
> #mount -t resctrl resctrl /sys/fs/resctrl/
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign
> [abmc]
> legacy_mbm
>
> Linux kernel detected ABMC feature and it is enabled.
Please note that this adds the "abmc" feature to the resctrl
*filesystem* that supports more architectures than just AMD. Calling the
resctrl filesystem feature "abmc" means that (a) AMD needs to be ok with
other architectures calling their features that are
similar-but-maybe-not-identical-to-AMD-ABMC "abmc", or (b) this needs
a new generic name.
> b. Check how many ABMC counters are available.
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_cntrs
> 32
>
> c. Create few resctrl groups.
>
> # mkdir /sys/fs/resctrl/mon_groups/default_mon1
> # mkdir /sys/fs/resctrl/non_defult_group
Can this be non_default_group instead? Seems like non_defult_group is used
consistently but its spelling is unexpected.
> # mkdir /sys/fs/resctrl/non_defult_group/mon_groups/non_default_mon1
>
> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> to list and modify the group's assignment states.
>
> The list follows the following format:
>
> * Default CTRL_MON group:
> "//<domain_id>=<assignment_flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id>=<assignment_flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id>=<assignment_flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
>
> Assignment flags can be one of the following:
>
> t MBM total event is assigned
> l MBM local event is assigned
> tl Both total and local MBM events are assigned
> _ None of the MBM events are assigned
>
> Examples:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_defult_group//0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
> non_defult_group/non_default_mon1/0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
> /default_mon1/0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
>
> There are four groups and all the groups have local and total event assigned.
>
> "//" - This is a default CONTROL MON group
>
> "non_defult_group//" - This is non default CONTROL MON group
>
> "/default_mon1/" - This is Child MON group of the defult group
>
> "non_defult_group/non_default_mon1/" - This is child MON group of the non default group
>
> =tl means both total and local events are assigned.
>
> e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.
>
> The write format is similar to the above list format with addition of
> op-code for the assignment operation.
>
> * Default CTRL_MON group:
> "//<domain_id><op-code><assignment_flags>"
>
> * Non-default CTRL_MON group:
> "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
>
> * Child MON group of default CTRL_MON group:
> "/<MON group>/<domain_id><op-code><assignment_flags>"
>
> * Child MON group of non-default CTRL_MON group:
> "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
>
> Op-code can be one of the following:
>
> = Update the assignment to match the flags
> + Assign a new state
> - Unassign a new state
> _ Unassign all the states
As mentioned in https://lore.kernel.org/lkml/ZjO9hpuLz%2FjJYqvT@e133380.arm.com/
the "_" is not an operator but instead viewed as an part of <assignment_flags>.
It is expected to be used with "=", to unset flags it will be used as below:
echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/0=_" ...
>
>
> Initial group status:
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=tl;1=tl;
> /child_default_mon_grp/0=tl;1=tl;
>
>
> To update the default group to assign only total event.
> # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> //0=t;1=t;
> /child_default_mon_grp/0=tl;1=tl;
As mentioned in https://lore.kernel.org/lkml/330e3391-b917-4a88-bae3-bdcbb8cfd6f4@intel.com/
using "0=t" is expected to only impact domain #0, not all domains. Similar for
other examples below.
>
> To update the MON group child_default_mon_grp to remove local event:
> # echo "/child_default_mon_grp/0-l" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> //0=t;1=t;
> /child_default_mon_grp/0=t;1=t;
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>
> To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
> remove both local and total events:
> # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/0_" >
> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>
> Assignment status after the update:
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> //0=t;1=t;
> /child_default_mon_grp/0=t;1=t;
> non_default_ctrl_mon_grp//0=tl;1=tl;
> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=_;1=_;
>
>
> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
> There is no change in reading the evetns with ABMC. If the event is unassigned
evetns -> events
> when reading, then the read will come back as Unavailable.
>
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 779247936
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> 765207488
>
> g. Users will have the option to go back to legacy_mbm mode if required.
> This can be done using the following command.
>
> # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
> abmc
> [legacy_mbm]
>
This needs a mention about how state is impacted when a user makes this
switch. For example, if switching from "legacy" to abmc ... if there
are fewer than "num counters" monitor groups, will they get counters
assigned dynamically? What happens to feature specific resctrl files?
What happens to the counters themselves, are they reset? What else
happens during this switch?
>
> h. Check the bandwidth configuration for the group. Note that bandwidth
> configuration has a domain scope. Total event defaults to 0x7F (to
> count all the events) and local event defaults to 0x15 (to count all
> the local numa events). The event bitmap decoding is available at
> https://www.kernel.org/doc/Documentation/x86/resctrl.rst
> in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x7f;1=0x7f
>
> #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> 0=0x15;1=0x15
>
> j. Change the bandwidth source for domain 0 for the total event to count only reads.
> Note that this change effects total events on the domain 0.
>
> #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> 0=0x33;1=0x7F
>
> k. Now read the total event again. The mbm_total_bytes should display
> only the read events.
>
> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> 314101
>
> l. Unmount the resctrl
>
> #umount /sys/fs/resctrl/
>
> ---
Reinette
Powered by blists - more mailing lists