Message-ID: <3223bd31-2112-0c5e-08d4-7e4942d031ec@amd.com>
Date: Wed, 21 Aug 2024 20:31:40 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v6 00/22] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)
Hi Reinette,
On 8/16/24 16:28, Reinette Chatre wrote:
> Hi Babu,
>
> On 8/6/24 3:00 PM, Babu Moger wrote:
>>
>> Feature adds following interface files:
>>
>> /sys/fs/resctrl/info/L3_MON/mbm_mode: Reports the list of assignable
>> monitoring features supported. The enclosed brackets indicate which
>> feature is enabled.
>
> I've been considering this file as a generic file where all future "MBM modes"
> can be captured, while this series treats it as specific to "assignable
> monitoring features" (btw, should this be "assignable monitoring modes" to
> match the name?). Looking closer at this implementation it does make things
> easier that "mbm_mode" is specific to "assignable monitoring features" but
> when doing so I think it should have a less generic name to avoid the
> obstacles we have with the existing "mon_features". Apologies that this goes
> back to be close to what you had earlier ... maybe "mbm_assign_mode"?
Let's see:
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
[mbm_cntr_assign] <- This already says 'assign'. Isn't that enough?
default <- Default mode is not related to assignable features.
I would think mbm_mode is fine. Let me know.
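
For illustration, here is how user space could pick out the enabled mode
from the bracketed listing, whatever the file ends up being called. This is
only a sketch: parse_mbm_mode() is a hypothetical helper name, and the
mode names are simply the ones shown above.

```python
def parse_mbm_mode(text):
    """Return (enabled_mode, all_modes) from the contents of mbm_mode.

    The enabled mode is the one enclosed in square brackets.
    """
    enabled = None
    modes = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            enabled = name
        modes.append(name)
    return enabled, modes


if __name__ == "__main__":
    sample = "[mbm_cntr_assign]\ndefault\n"
    print(parse_mbm_mode(sample))
```

With the sample listing, the enabled mode is mbm_cntr_assign and both
names are reported as supported.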
>>
>> /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs: Reports the number of monitoring
>> counters available for assignment.
>>
>> /sys/fs/resctrl/info/L3_MON/mbm_control: Reports the resctrl group and
>> monitor
>> status of each group. Assignment state can be updated by writing to the
>> interface.
>>
>> # Examples
>>
>> a. Check if ABMC support is available
>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>> [mbm_cntr_assign]
>> legacy
>>
>> ABMC feature is detected and it is enabled.
>>
>> b. Check how many ABMC counters are available.
>>
>> #cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>> 32
>>
>> c. Create a few resctrl groups.
>>
>> # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
>>
>>
>> d. This series adds a new interface file
>> /sys/fs/resctrl/info/L3_MON/mbm_control
>> to list and modify the group's monitoring states. File provides
>> single place
>> to list monitoring states of all the resctrl groups. It makes it
>> easier for
>> user space to learn about the counters are used without needing to
>> traverse
>
> "to learn about the counters are used" -> "to learn the counters that are
> used" or
> "to learn about the used counters" or ...?
Sure.
>
>> all the groups thus reducing the number of file system calls.
>>
>> The list follows the following format:
>>
>> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>>
>> Format for specific type of groups:
>>
>> * Default CTRL_MON group:
>> "//<domain_id>=<flags>"
>>
>> * Non-default CTRL_MON group:
>> "<CTRL_MON group>//<domain_id>=<flags>"
>>
>> * Child MON group of default CTRL_MON group:
>> "/<MON group>/<domain_id>=<flags>"
>>
>> * Child MON group of non-default CTRL_MON group:
>> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>>
>> Flags can be one of the following:
>>
>> t MBM total event is enabled.
>> l MBM local event is enabled.
>> tl Both total and local MBM events are enabled.
>> _ None of the MBM events are enabled
>>
>> Examples:
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=tl;1=tl;
>> /child_default_mon_grp/0=tl;1=tl;
>>
>> There are four groups and all the groups have local and total
>> event enabled on domain 0 and 1.
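
To show how user space might consume this listing in a single read, here is
a rough sketch of a parser for the format described above. The function
name parse_mbm_control() is hypothetical, and the sketch assumes the
"<CTRL_MON group>/<MON group>/<domain_id>=<flags>;" layout stays as
proposed in this series.

```python
def parse_mbm_control(text):
    """Parse mbm_control output.

    Returns {(ctrl_mon_group, mon_group): {domain_id: flags}}.
    An empty string for a group name means the default group.
    """
    groups = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Group names are directory names and cannot contain '/', so the
        # line always splits into exactly three fields.
        ctrl_mon, mon, domains = line.split("/", 2)
        states = {}
        for entry in domains.split(";"):
            if not entry:
                continue  # the trailing ';' leaves an empty entry
            domain_id, flags = entry.split("=", 1)
            states[int(domain_id)] = flags
        groups[(ctrl_mon, mon)] = states
    return groups


if __name__ == "__main__":
    sample = (
        "non_default_ctrl_mon_grp//0=tl;1=tl;\n"
        "non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;\n"
        "//0=tl;1=tl;\n"
        "/child_default_mon_grp/0=tl;1=tl;\n"
    )
    for key, states in parse_mbm_control(sample).items():
        print(key, states)
```

The default CTRL_MON group shows up under the key ("", "") and a child MON
group of the default group under ("", "<MON group>").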
>>
>> e. Update the group assignment states using the interface file
>> /sys/fs/resctrl/info/L3_MON/mbm_control.
>>
>> The write format is similar to the above list format with addition
>> of opcode for the assignment operation.
>> "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>>
>>
>> * Default CTRL_MON group:
>> "//<domain_id><opcode><flags>"
>>
>> * Non-default CTRL_MON group:
>> "<CTRL_MON group>//<domain_id><opcode><flags>"
>>
>> * Child MON group of default CTRL_MON group:
>> "/<MON group>/<domain_id><opcode><flags>"
>>
>> * Child MON group of non-default CTRL_MON group:
>> "<CTRL_MON group>/<MON group>/<domain_id><opcode><flags>"
>>
>> Opcode can be one of the following:
>>
>> = Update the assignment to match the flag.
>> + Assign a new event.
>> - Unassign an event.
>
> Since user space can provide more than one flag the text could be more
> accurate
> noting this. Eg. "Update the assignment to match the flag" -> "Update the
> assignment
> to match the flags.".
Sure.
>
>>
>> Flags can be one of the following:
>>
>> t MBM total event.
>> l MBM local event.
>> tl Both total and local MBM events.
>> _ None of the MBM events. Only works with '=' opcode.
>
> Please take care with the implementation that seems to support a variety of
> combinations. If I understand correctly the implementation supports flags
> like, for example, "tttt", "llll", "ltlt" ... those may not be an issue but
> of most concern is, for example, a pattern like "_lt" that (unexpectedly)
> appears to result in both total and local being set.
Yes. Should we not allow flag combinations with "_"?
I am not very sure about how to go about this.
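
To make the question concrete, here is one possible strict policy written
as a user-space sketch. It is not what the series implements today.
normalize_flags() is a hypothetical name, and rejecting repeated flags
such as "tttt" is an assumption on my part; tolerating duplicates would
work equally well with this structure.

```python
def normalize_flags(opcode, flags):
    """Validate <flags> for a given <opcode>.

    Returns the canonical form 't', 'l', 'tl' or '_'.
    Raises ValueError for anything else.
    """
    if opcode not in ("=", "+", "-"):
        raise ValueError("unknown opcode: %r" % opcode)
    if not flags:
        raise ValueError("empty flags")
    if "_" in flags:
        # '_' means "no events", so it cannot be mixed with 't' or 'l'.
        if flags != "_":
            raise ValueError("'_' cannot be combined with other flags")
        if opcode != "=":
            raise ValueError("'_' only works with the '=' opcode")
        return "_"
    seen = set()
    for ch in flags:
        if ch not in "tl":
            raise ValueError("unknown flag: %r" % ch)
        if ch in seen:
            # Assumed policy: repeated flags are rejected, not ignored.
            raise ValueError("duplicate flag: %r" % ch)
        seen.add(ch)
    # Report flags in a fixed order regardless of input order.
    return "".join(ch for ch in "tl" if ch in seen)


if __name__ == "__main__":
    print(normalize_flags("=", "lt"))
    try:
        normalize_flags("=", "_lt")
    except ValueError as err:
        print("rejected:", err)
```

Under this policy "lt" is accepted and reported as "tl", while "_lt",
"tttt" and "+_" are all rejected instead of silently assigning events.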
>
>>
>> Initial group status:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=tl;1=tl;
>> /child_default_mon_grp/0=tl;1=tl;
>>
>> To update the default group to enable only total event on domain 0:
>> # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=t;1=tl;
>> /child_default_mon_grp/0=tl;1=tl;
>>
>> To update the MON group child_default_mon_grp to remove total event
>> on domain 1:
>> # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=t;1=tl;
>> /child_default_mon_grp/0=tl;1=l;
>>
>> To update the MON group
>> non_default_ctrl_mon_grp/child_non_default_mon_grp to
>> remove both local and total events on domain 1:
>> # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>> //0=t;1=tl;
>> /child_default_mon_grp/0=tl;1=l;
>>
>> To update the default group to add a local event on domain 0:
>> # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>> //0=tl;1=tl;
>> /child_default_mon_grp/0=tl;1=l;
>>
>> To update the non default CTRL_MON group non_default_ctrl_mon_grp to
>> unassign all
>> the MBM events on all the domains.
>> # echo "non_default_ctrl_mon_grp//*=_" > /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> non_default_ctrl_mon_grp//0=_;1=_;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>> //0=tl;1=tl;
>> /child_default_mon_grp/0=tl;1=l;
>>
>>
>> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
>> There is no change in reading the events with ABMC. If the event is
>> unassigned
>> when reading, then the read will come back as "Unassigned".
>>
>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> 779247936
>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> 765207488
>>
>> g. Check the bandwidth configuration for the group. Note that bandwidth
>> configuration has a domain scope. Total event defaults to 0x7F (to
>> count all the events) and local event defaults to 0x15 (to count all
>> the local numa events). The event bitmap decoding is available at
>> https://www.kernel.org/doc/Documentation/x86/resctrl.rst
>> in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> 0=0x7f;1=0x7f
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>> 0=0x15;1=0x15
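
For reference, the per-domain values above can be decoded with a short
script. This is a sketch only: the bit descriptions are paraphrased from
the resctrl documentation referenced above, and parse_config_line() and
decode_config() are hypothetical helper names.

```python
# Bit positions of the bandwidth event configuration, paraphrased from the
# "mbm_total_bytes_config"/"mbm_local_bytes_config" documentation.
EVENT_BITS = {
    0: "reads to memory in the local NUMA domain",
    1: "reads to memory in the non-local NUMA domain",
    2: "non-temporal writes to the local NUMA domain",
    3: "non-temporal writes to the non-local NUMA domain",
    4: "reads to slow memory in the local NUMA domain",
    5: "reads to slow memory in the non-local NUMA domain",
    6: "dirty victims from the QoS domain to all types of memory",
}


def parse_config_line(text):
    """Turn '0=0x7f;1=0x7f' into {0: 0x7f, 1: 0x7f}."""
    result = {}
    for entry in text.strip().split(";"):
        if entry:
            domain_id, value = entry.split("=", 1)
            result[int(domain_id)] = int(value, 16)
    return result


def decode_config(value):
    """List the event types selected by a configuration bitmap."""
    return [desc for bit, desc in sorted(EVENT_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    for domain_id, value in parse_config_line("0=0x15;1=0x15").items():
        print(domain_id, hex(value), decode_config(value))
```

Decoding 0x15 selects bits 0, 2 and 4 (the local NUMA events), and 0x7f
selects all seven event types.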
>>
>> h. Change the bandwidth source for domain 0 for the total event to count
>> only reads.
>> Note that this change affects total events on domain 0.
>>
>> #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> 0=0x33;1=0x7f
>>
>> i. Now read the total event again. The first read will come back with
>> "Unavailable"
>> status. The subsequent read of mbm_total_bytes will display only the
>> read events.
>>
>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> Unavailable
>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> 314101
>>
>> j. Users will have the option to go back to legacy mbm_mode if required.
>> This can be done using the following command. Note that switching the
>> mbm_mode will reset all the mbm counters of all resctrl groups.
>
> "reset all the mbm counters" -> "reset all the MBM counters"
Sure.
>
>>
>> # echo "legacy" > /sys/fs/resctrl/info/L3_MON/mbm_mode
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_mode
>> mbm_cntr_assign
>> [legacy]
>>
>>
>> k. Unmount the resctrl
>>
>> #umount /sys/fs/resctrl/
>> ---
>> v6:
>> We still need to finalize a few interface details on mbm_mode and
>> mbm_control
>> in case of ABMC and Soft-ABMC. We can continue the discussion with
>> this series.
>
> Could you please list the details that need to be finalized?
1. mbm_mode display
# cat /sys/fs/resctrl/info/L3_MON/mbm_mode
mbm_cntr_assign
[legacy]
"mbm_cntr_assign"
Are we sticking with "mbm_cntr_assign" for ABMC?
What should we name it for soft-ABMC?
2. We also had some concerns about individual event assignment (ABMC)
and group assignment (soft-ABMC).
Are the flags "t" and "l" good for both these modes?
>
> Thank you
>
> Reinette
>
--
Thanks
Babu Moger