[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68ad366a-187e-4b99-a7c2-a7aaf8abbbf3@amd.com>
Date: Mon, 19 Feb 2024 12:00:41 -0600
From: "Moger, Babu" <babu.moger@....com>
To: Peter Newman <peternewman@...gle.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com,
paulmck@...nel.org, rdunlap@...radead.org, tj@...nel.org,
peterz@...radead.org, yanjiewtw@...il.com, kim.phillips@....com,
lukas.bulwahn@...il.com, seanjc@...gle.com, jmattson@...gle.com,
leitao@...ian.org, jpoimboe@...nel.org, rick.p.edgecombe@...el.com,
kirill.shutemov@...ux.intel.com, jithu.joseph@...el.com,
kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, maciej.wieczor-retman@...el.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, eranian@...gle.com,
"james.morse@....com" <james.morse@....com>
Subject: Re: [PATCH v2 00/17] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)
Hi Peter,
On 2/16/24 14:18, Peter Newman wrote:
> Hi Babu,
>
> On Thu, Feb 8, 2024 at 9:29 AM Moger, Babu <babu.moger@....com> wrote:
>> On 2/5/24 16:38, Reinette Chatre wrote:
>>> This could be improved beyond a binary "enable"/"disable" interface to user space.
>>> For example, the hardware can discover which "mbm counter assign" related feature
>>> (I'm counting the "soft RMID" here as one of the "mbm counter assign" related
>>> features) is supported on the platform and it can be presented to the user like:
>>>
>>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>>> [feature_1] feature_2 feature_3
>>
>> How about this?
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> ABMC:Capable
>> SOFT-RMID:Capable
>>
>> To enable ABMC
>> # echo ABMC:enable > /sys/fs/resctrl/info/L3_MON/mbm_assign
>>
>> When ABMC is enabled:
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign
>> ABMC:Enable
>> SOFT-RMID:Capable
>
> There would be no need to use soft RMIDs on a system that supports
> ABMC, so I can't think of a reason why the underlying implementation
> would matter to our users. The user should only have to request the
> interface where monitors must be assigned manually. The mount would
> succeed if the system has a way to support the interface.
Ok Sure. I will exclude Soft-rmid for this interface.
For now, lets keep this only for ABMC.
# cat /sys/fs/resctrl/info/L3_MON/mbm_assign
ABMC:Capable
Or
# cat /sys/fs/resctrl/info/L3_MON/mbm_assign
ABMC:Enable
>
>
>>> You have made it clear on several occasions that you do not intend to support
>>> domain level assignment. That may be ok but the interface you create should
>>> not prevent future support of domain level assignment.
>>>
>>> If my point is not clear, could you please share how this interface is able to
>>> support domain level assignment in the future?
>>>
>>> I am starting to think that we need a file similar to the schemata file
>>> for group and domain level monitor configurations.
>>
>> Something like this?
>>
>> By default
>> #cat /sys/fs/resctrl/monitor_state
>> default:0=total=assign,local=assign;1=total=assign,local=assign
>>
>> With ABMC,
>> #cat /sys/fs/resctrl/monitor_state
>> ABMC:0=total=unassign,local=unassign;1=total=unassign,local=unassign
>
> The benefit from all the string parsing in this interface is only
> halving the number of monitor_state sysfs writes we'd need compared to
> creating a separate file for mbm_local and mbm_total. Given that our
> use case is to assign the 32 assignable counters to read the bandwidth
> of ~256 monitoring groups, this isn't a substantial gain to help us. I
> think you should just focus on providing the necessary control
> granularity without trying to consolidate writes in this interface. I
Ok. Looks like we need to provide the interface to assign the RMIDs to
individual domains in this interface. I wasn't planning that now. But, it
can be done without much changes.
Something like this(corrected typos: replaced '=' with '-').
#cat /sys/fs/resctrl/monitor_state
ABMC:0=total-unassign,local-unassign;1=total-unassign,local-unassign
To assign:
#echo "ABMC:0=total-assign,local-assign" > /sys/fs/resctrl/monitor_state
> will propose an additional interface below to optimize our use case.
>
> Whether mbm_total and mbm_local are combined in the group directories
> or not, I don't see why you wouldn't just repeat the same file
> interface in the domain directories for a user needing finer-grained
> controls.
I don't see the need for the same file inside each domain directory in the
group level when the above command can assign the RMIDs per domain.
>
>
>>>> Peter, James,
>>>>
>>>> Please comment on what you want achieve in "assignment" based on the features you are working on.
>
> I prototyped and tested the following additional interface for the
> large-scale, batch use case that we're primarily concerned about:
>
> info/L3_MON/mbm_{local,total}_bytes_assigned
>
> Writing a whitespace-delimited list of mongroup directory paths does
> the following:
> 1. unassign all monitors for the given counter
> 2. assigns a monitor to each mongroup referenced in the write
> 3. batches per-domain register updates resulting from the assignments
> into a single IPI for each domain
>
> This interface allows us to do less sysfs writes and IPIs on systems
> with more assignable monitoring resources, rather than doing more.
>
> The reference to a mongroup when reading/writing the above node is the
> resctrl-root-relative path to the monitoring group. There is probably
> a more concise way to refer to the groups, but my prototype used
> kernfs_walk_and_get() to locate each rdtgroup struct.
>
> I would also like to add that in the software-ABMC prototype I made,
> because it's based on assignment of a small number of RMIDs,
> assignment results in all counters being assigned at once. On
> implementations where per-counter assignments aren't possible,
> assignment through such a resource would be allowed to assign more
> resources than explicitly requested.
>
> This would allow an implementation only capable of global assignment
> to assign resources to all groups when a non-empty string is written
> to the proposed file nodes, and all resources to be unassigned when an
> empty string is written. Reading back from the file nodes would tell
> the user how much was actually assigned.
Yes. This interface can be extended to ABMC as a global assignment option.
If you have your patches ready I can add your patches on top of my ABMC
feature.
Or if you want to add the support later then I will go ahead with current
base ABMC support.
Let me know.
--
Thanks
Babu Moger
Powered by blists - more mailing lists