[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55b545fd-2851-0d0f-ac37-ec59838fb4b4@amd.com>
Date: Mon, 4 Mar 2024 13:34:08 -0600
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
James Morse <james.morse@....com>, corbet@....net, fenghua.yu@...el.com,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, Peter Newman <peternewman@...gle.com>
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, maciej.wieczor-retman@...el.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, eranian@...gle.com
Subject: Re: [PATCH v2 00/17] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)
Hi Reinette,
On 3/1/2024 5:20 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 3/1/2024 12:36 PM, Moger, Babu wrote:
>> On 2/29/24 15:50, Reinette Chatre wrote:
>>> On 2/29/2024 12:37 PM, Moger, Babu wrote:
> ...
>
>>>> To make more clear, let me list all the groups here based this.
>>>>
>>>> When none of the counters assigned:
>>>>
>>>> $cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>>> resctrl/00=none,none;01=none,none (#default control_mon group)
>>>> resctrl/mon_a/00=none,none;01=none,none (#mon group)
>>>> resctrl/ctrl_a/00=none,none;01=none,none (#control_mon group)
>>>> resctrl/ctrl_a/mon_ab/00=none,none;01=none,none (#mon group)
>>> I am concerned that inconsistent use of "/" will make parsing hard.
>> Do you mean, you don't want to see multiple "/"?
> No. I think that having a consistent number of "/" will be easier to
> parse. In the above example, there are instances of 1, 2, as well as
> three "/" among the lines. That seems complicated to parse.
>
> I was thinking that it will make interpreting and parsing easier if there
> consistently are just always two "/".
>
> (You may find things to be different once you work on the parsing code
> though.)
>
> In summary:
> * for monitoring of default CTRL_MON group: "//<flags>"
> * for MON_GROUP inside default CTRL_MON group: "/<MON group>/<flags>"
> * for monitoring of non-default CTRL_MON group: "<CTRL_MON group>//flags"
> * for MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON group>/<flags>"
>
> What do you think?
Looks like you tried to keep two "/" in all the options. Looks good most
part. Will keep the options open for changes when we start implementing.
>
>> resctrl/ctrl_a/mon_ab/
>>
>> Change to
>>
>> mon_ab/
> rather:
> ctrl_a/mon_ab/<flags>
Sure.
>
>>> I find "resctrl" and all the "none" redundant. It is not clear what
>>> this improves.
>>> Why have:
>>> resctrl/00=none,none;01=none,none
>>> when this could do:
>>> //00=_;01=_
>> ok.
>>
>> "//" meaning root of resctrl filesystem?
> More specifically, monitoring of default control group. It is not intended to
> specify a pathname.
>
>>>> When some counters are assigned:
>>>>
>>>> $echo "resctrl/00=total,local" >
>>>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control (#assigning counter to
>>>> default group)
>>>>
>>>> $echo "resctrl/mon_a/00=total;01=total" >
>>>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control (#assigning counter to mon
>>>> group)
>>>>
>>>> $echo "resctrl/ctrl_a/00=local;01=local" >
>>>> /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>>>
>>>> $echo "resctrl/ctrl_a/mon_ab/00=total,local;01=total,local" >
>>>> /sys/fs/resctrl/nfo/L3_MON/mbm_assign_control
>>>>
>>> We could learn some more lessons from dynamic debug (see Documentation/admin-guide/dynamic-debug-howto.rst). For example, "=" can be used to make an assignment while "+"
>>> can be used to add a counter and "-" can be used to remove a counter.
>>> "=_" can be used to remove counters from all events in that domain.
>> Yes. Looked at dynamic debug. I am still learning this interface. Some examples below based on my understanding.
>>
>> To assign a counters to default group on domain 0.
>> $echo "//00=+lt;01=+lt" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> It should not be necessary to use both "=" and "+" in the same assignment.
> I think of "=" as "assign" and "+" as append ("-" as remove).
Here are our options.
a. assign one event (+)
b. unassign one event (-)
c. assign both (++ may be?)
d. unassign both (_)
I think append ( "=") is not required while assigning. It is confusing.
Assign or Add both involve same action.
How about this? This might be easy to parse. May be space (" ") after
the domain id.
<group>/<domain id> <action><event>; <domain id> <action><event>
>
> An example of this, just focusing on default group.
>
> #hypothetical start state of no counters assigned
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=_;01=_
Looks good.
>
> #assign counter to total MBM of both domains
> $ echo "//00=t;01=t" > mbm_assign_control
There is no difference in assign or add. Just add total MBM event.
$ echo "//00 +t;01 +t" > mbm_assign_control
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=t;01=t
good.
>
> #add counter to local MBM of both domains without impacting total MBM counters
> $echo "//00+l;01+l" > mbm_assign_control
It is not required to know about whether total MBM event is already
assigned or not. Just assign the event of your interest. If it is
already assigned then kernel just ignores it. Kernel has information all
the assignment status.
$echo "//00 +l;01 +l" > mbm_assign_control
We will know the full status of the assignment when we list again.
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=tl;01=tl
Good.
>
> #remove local MBM counters without impacting total MBM counters
> $echo "//00-l;01-l" > mbm_assign_control
Remove local MBM counters. We don't need to know about total MBM counter.
$echo "//00 -l;01 -l" > mbm_assign_control
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=t;01=t
Good.
>
> #assign local MBM counters, removing total MBM counters while doing so
> $echo "//00=l;01=l" > mbm_assign_control
Again confusing here. Just remove total event and add local event in
two commands.
$echo "//00 -t;01 -t" > mbm_assign_control
$echo "//00 +l;01 +l" > mbm_assign_control
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=l;01=l
good
>
> #remove all counters
> $echo "//00=_;01=_" > mbm_assign_control
> $ cat mbm_assign_control
> #control_group/monitor_group/flags
> //00=_;01=_
>
>
>> To assign a counters to mon group inside the default group.
>> $echo "mon_a/00=+t;01=+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> I think it will simplify parsing if number of "/" is consistent:
> $echo "/mon_a/00=t;01=t" > ...
>
>> To assign a counters to control mon group inside the default group.
> It is not clear to me what you mean with this.
>
>> $echo "ctrl_a/00=+l;01=+l" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> this looks similar to previous example, so I think it will be hard for parser
> to know whether it is dealing with control group or monitor group.
> I am not sure I understand your example, but this may perhaps be:
>
> echo "ctrl_a//00=l;01=l > ...
>
>> To assign a counters to control mon group inside another control group.
> I do not know what you mean with "another control group"
>
>> $echo "mon_ab/00=+lt;01=+lt" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_contro
> How will parser know which control group? I was expecting:
> $ echo "<CTRL_MON group>/<MON group>/<flags>"
Sure.
>
>> To unassign a counters to control mon group inside another control group.
>> $echo "mon_ab/00=-lt;01=-lt" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_control
>>
>> To unassign all the counters on a specific group.
>> $echo "mon_ab/00=_" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_control
>>
>> It does not matter control group or mon group. We just need to name of the group in this interface.
> It matters because users can have monitor groups with the same name within
> different control groups.
Agree.
I will list all the example again once we agree on specific format.
Thanks
Babu
Powered by blists - more mailing lists