[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2016b830-64c7-43bd-8116-bdfd239221e3@amd.com>
Date: Mon, 6 May 2024 12:18:19 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [RFC PATCH v3 00/17] x86/resctrl : Support AMD Assignable
Bandwidth Monitoring Counters (ABMC)
Hi Reinette,
On 5/3/24 18:24, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/28/2024 6:06 PM, Babu Moger wrote:
>
>> a. Check if ABMC support is available
>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> [abmc]
>> legacy_mbm
>>
>> Linux kernel detected ABMC feature and it is enabled.
>
> Please note that this adds the "abmc" feature to the resctrl
> *filesystem* that supports more architectures than just AMD. Calling the
> resctrl filesystem feature "abmc" means that (a) AMD needs to be ok with
> other architectures calling their features that are
> similar-but-maybe-not-identical-to-AMD-ABMC "abmc", or (b) this needs
> a new generic name.
It should not a problem if other architecture calling abmc for similar
feature. But generic name is always better if there is a suggestion.
>
>> b. Check how many ABMC counters are available.
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_assign_cntrs
>> 32
>>
>> c. Create few resctrl groups.
>>
>> # mkdir /sys/fs/resctrl/mon_groups/default_mon1
>> # mkdir /sys/fs/resctrl/non_defult_group
>
> Can this be non_default_group instead? Seems like non_defult_group is used
> consistently but its spelling is unexpected.
Ok. Will correct it. Thanks
>
>> # mkdir /sys/fs/resctrl/non_defult_group/mon_groups/non_default_mon1
>>
>> d. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> to list and modify the group's assignment states.
>>
>> The list follows the following format:
>>
>> * Default CTRL_MON group:
>> "//<domain_id>=<assignment_flags>"
>>
>> * Non-default CTRL_MON group:
>> "<CTRL_MON group>//<domain_id>=<assignment_flags>"
>>
>> * Child MON group of default CTRL_MON group:
>> "/<MON group>/<domain_id>=<assignment_flags>"
>>
>> * Child MON group of non-default CTRL_MON group:
>> "<CTRL_MON group>/<MON group>/<domain_id>=<assignment_flags>"
>>
>> Assignment flags can be one of the following:
>>
>> t MBM total event is assigned
>> l MBM local event is assigned
>> tl Both total and local MBM events are assigned
>> _ None of the MBM events are assigned
>>
>> Examples:
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> non_defult_group//0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
>> non_defult_group/non_default_mon1/0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
>> /default_mon1/0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;
>>
>> There are four groups and all the groups have local and total event assigned.
>>
>> "//" - This is a default CONTROL MON group
>>
>> "non_defult_group//" - This is non default CONTROL MON group
>>
>> "/default_mon1/" - This is Child MON group of the defult group
>>
>> "non_defult_group/non_default_mon1/" - This is child MON group of the non default group
>>
>> =tl means both total and local events are assigned.
>>
>> e. Update the group assignment states using the interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control.
>>
>> The write format is similar to the above list format with addition of
>> op-code for the assignment operation.
>>
>> * Default CTRL_MON group:
>> "//<domain_id><op-code><assignment_flags>"
>>
>> * Non-default CTRL_MON group:
>> "<CTRL_MON group>//<domain_id><op-code><assignment_flags>"
>>
>> * Child MON group of default CTRL_MON group:
>> "/<MON group>/<domain_id><op-code><assignment_flags>"
>>
>> * Child MON group of non-default CTRL_MON group:
>> "<CTRL_MON group>/<MON group>/<domain_id><op-code><assignment_flags>"
>>
>> Op-code can be one of the following:
>>
>> = Update the assignment to match the flags
>> + Assign a new state
>> - Unassign a new state
>> _ Unassign all the states
>
> As mentioned in https://lore.kernel.org/lkml/ZjO9hpuLz%2FjJYqvT@e133380.arm.com/
> the "_" is not an operator but instead viewed as an part of <assignment_flags>.
> It is expected to be used with "=", to unset flags it will be used as below:
>
> echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/0=_" ...
Oh.. ok.
Will correct it. I also need to verify the parshing..
>
>>
>>
>> Initial group status:
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=tl;1=tl;
>> /child_default_mon_grp/0=tl;1=tl;
>>
>>
>> To update the default group to assign only total event.
>> # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> //0=t;1=t;
>> /child_default_mon_grp/0=tl;1=tl;
>
> As mentioned in https://lore.kernel.org/lkml/330e3391-b917-4a88-bae3-bdcbb8cfd6f4@intel.com/
> using "0=t" is expected to only impact domain #0, not all domains. Similar for
> other examples below.
Ok. Sure
>
>>
>> To update the MON group child_default_mon_grp to remove local event:
>> # echo "/child_default_mon_grp/0-l" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> //0=t;1=t;
>> /child_default_mon_grp/0=t;1=t;
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>>
>> To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
>> remove both local and total events:
>> # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/0_" >
>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>
>> Assignment status after the update:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> //0=t;1=t;
>> /child_default_mon_grp/0=t;1=t;
>> non_default_ctrl_mon_grp//0=tl;1=tl;
>> non_default_ctrl_mon_grp/child_non_default_mon_grp/0=_;1=_;
>>
>>
>> f. Read the event mbm_total_bytes and mbm_local_bytes of the default group.
>> There is no change in reading the evetns with ABMC. If the event is unassigned
>
> evetns -> events
Sure.
>
>> when reading, then the read will come back as Unavailable.
>>
>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> 779247936
>> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> 765207488
>>
>> g. Users will have the option to go back to legacy_mbm mode if required.
>> This can be done using the following command.
>>
>> # echo "legacy_mbm" > /sys/fs/resctrl/info/L3_MON/mbm_assign
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> abmc
>> [legacy_mbm]
>>
>
> This needs a mention about how state is impacted when a user makes this
> switch. For example, if switching from "legacy" to abmc ... if there
> are fewer than "num counters" monitor groups, will they get counters
> assigned dynamically? What happens to feature specific resctrl files?
> What happens to the counters themselves, are they reset? What else
> happens during this switch?
Sure. Will add the explanation.
When switching from "legacy" to abmc, events in already created resctrl
groups will be in Unassigned states. Users need to assign the monitors to
each group to read the events.
>
>>
>> h. Check the bandwidth configuration for the group. Note that bandwidth
>> configuration has a domain scope. Total event defaults to 0x7F (to
>> count all the events) and local event defaults to 0x15 (to count all
>> the local numa events). The event bitmap decoding is available at
>> https://www.kernel.org/doc/Documentation/x86/resctrl.rst
>> in section "mbm_total_bytes_config", "mbm_local_bytes_config":
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> 0=0x7f;1=0x7f
>>
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
>> 0=0x15;1=0x15
>>
>> j. Change the bandwidth source for domain 0 for the total event to count only reads.
>> Note that this change effects total events on the domain 0.
>>
>> #echo 0=0x33 > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> #cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
>> 0=0x33;1=0x7F
>>
>> k. Now read the total event again. The mbm_total_bytes should display
>> only the read events.
>>
>> #cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> 314101
>>
>> l. Unmount the resctrl
>>
>> #umount /sys/fs/resctrl/
>>
>> ---
>
> Reinette
--
Thanks
Babu Moger
Powered by blists - more mailing lists