[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5e9a5b3e-793c-4873-a1d2-33b62614ec31@amd.com>
Date: Thu, 13 Feb 2025 10:19:29 -0600
From: "Moger, Babu" <babu.moger@....com>
To: Dave Martin <Dave.Martin@....com>, peternewman@...gle.com
Cc: corbet@....net, reinette.chatre@...el.com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
tony.luck@...el.com, x86@...nel.org, hpa@...or.com, paulmck@...nel.org,
akpm@...ux-foundation.org, thuth@...hat.com, rostedt@...dmis.org,
xiongwei.song@...driver.com, pawan.kumar.gupta@...ux.intel.com,
daniel.sneddon@...ux.intel.com, jpoimboe@...nel.org, perry.yuan@....com,
sandipan.das@....com, kai.huang@...el.com, xiaoyao.li@...el.com,
seanjc@...gle.com, xin3.li@...el.com, andrew.cooper3@...rix.com,
ebiggers@...gle.com, mario.limonciello@....com, james.morse@....com,
tan.shaopeng@...itsu.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
eranian@...gle.com
Subject: Re: [PATCH v11 00/23] x86/resctrl : Support AMD Assignable Bandwidth
Monitoring Counters (ABMC)
Hi Dave,
Thanks for your help. Reinette has asked few questions already. I have few
more questions on top of that.
On 2/12/25 11:46, Dave Martin wrote:
> Hi there,
>
> On Wed, Jan 22, 2025 at 02:20:08PM -0600, Babu Moger wrote:
>>
>> This series adds the support for Assignable Bandwidth Monitoring Counters
>> (ABMC). It is also called QoS RMID Pinning feature
>>
>> Series is written such that it is easier to support other assignable
>> features supported from different vendors.
>>
>> The feature details are documented in the APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC). The documentation is available at
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>>
>> The patches are based on top of commit
>> d361b84d51bfe (tip/master) Merge branch into tip/master: 'x86/tdx'
>>
>> # Introduction
>
> [...]
>
>> # Examples
>>
>> a. Check if ABMC support is available
>> #mount -t resctrl resctrl /sys/fs/resctrl/
>>
>> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
>> [mbm_cntr_assign]
>> default
>
> (Nit: can this be called "mbm_counter_assign"? The name is already
> long, so I wonder whether anything is gained by using a cryptic
> abbreviation for "counter". Same with all the "cntrs" elsewhere.
> This is purely cosmetic, though -- the interface works either way.)
Yes. We can do that.
>
>> ABMC feature is detected and it is enabled.
>>
>> b. Check how many ABMC counters are available.
>>
>> # cat /sys/fs/resctrl/info/L3_MON/num_mbm_cntrs
>> 32
>
> Is this file needed?
>
> With MPAM, it is more difficult to promise that the same number of
> counters will be available everywhere.
>
> Rather than lie, or report a "safe" value here that may waste some
> counters, can we just allow the number of counters to be be discovered
> per domain via available_mbm_cntrs?
As Reinette suggested below we can display per domain supported counters
here.
https://lore.kernel.org/lkml/9e849476-7c4b-478b-bd2a-185024def3a3@intel.com/
>
> num_closids and num_rmids are already problematic for MPAM, so it would
> be good to avoid any more parameters of this sort from being reported
> to userspace unless there is a clear understanding of why they are
> needed.
>
> Reporting number of counters per monitoring domain is a more natural
> fit for MPAM, as below:
>
>> c. Check how many ABMC counters are available in each domain.
>>
>> # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
>> 0=30;1=30
>
> For MPAM, this seems supportable. Each monitoring domain will have
> some counters, and a well-defined number of them will be available for
> allocation at any one time.
>
>> d. Create few resctrl groups.
>>
>> # mkdir /sys/fs/resctrl/mon_groups/child_default_mon_grp
>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp
>> # mkdir /sys/fs/resctrl/non_default_ctrl_mon_grp/mon_groups/child_non_default_mon_grp
>>
>> e. This series adds a new interface file /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>> to list and modify any group's monitoring states. File provides single place
>> to list monitoring states of all the resctrl groups. It makes it easier for
>> user space to learn about the used counters without needing to traverse all
>> the groups thus reducing the number of file system calls.
>>
>> The list follows the following format:
>>
>> "<CTRL_MON group>/<MON group>/<domain_id>=<flags>"
>>
>> Format for specific type of groups:
>>
>> * Default CTRL_MON group:
>> "//<domain_id>=<flags>"
>
> [...]
>
>> Flags can be one of the following:
>>
>> t MBM total event is enabled.
>> l MBM local event is enabled.
>> tl Both total and local MBM events are enabled.
>> _ None of the MBM events are enabled
>>
>> Examples:
>
> [...]
>
> I think that this basically works for MPAM.
>
> The local/total distinction doesn't map in a consistent way onto MPAM,
> but this problem is not specific to ABMC. It feels sensible for ABMC
> to be built around the same concepts that resctrl already has elsewhere
> in the interface. MPAM will do its best to fit (as already).
>
> Regarding Peter's use case of assiging multiple counters to a
> monitoring group [1], I feel that it's probably good enough to make
> sure that the ABMC interface can be extended in future in a backwards
> compatible way so as to support this, without trying to support it
> immediately.
>
> [1] https://lore.kernel.org/lkml/CALPaoCjY-3f2tWvBjuaQPfoPhxveWxxCxHqQMn4BEaeBXBa0bA@mail.gmail.com/
>
>
> For example, if we added new generic "letters" -- say, "0" to "9",
> combined with new counter files in resctrlfs, that feels like a
> possible approach. ABMC (as in this series) should just reject such
> such assignments, and the new counter files wouldn't exist.
What is "combined with new counter files"? Does MPAM going to add new
files to support counter assignment in ARM?
Also what is "0" to "9"? Is this counter ids?
>
> Availability of this feature could also be reported as a distinct mode
> in mbm_assign_mode, say "mbm_cntr_generic", or whatever.
Yes. That should be fine.
>
>
> A _sketch_ of this follows. This is NOT a proposal -- the key
> question is whether we are confident that we can extend the interface
> in this way in the future without breaking anything.
>
> If "yes", then the ABMC interface (as proposed by this series) works as
> a foundation to build on.
>
> --8<--
>
> [artists's impression]
>
> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> mbm_cntr_generic
> [mbm_cntr_assign]
> default
Yes. This looks good.
> # echo mbm_cntr_generic >/sys/fs/resctrl/info/L3_MON/mbm_assign_mode
> # echo '//0=01;1=23' >/sys/fs/resctrl/info/L3_MON/mbm_assign_control
Looks like you are assigning counter ids to domains here. That is
different than ABMC. In ABMC, we assign events (local or total) to the
domain. We internally handle the counter ids based on the availability.
Can MPAM follow the same concept? It is possible?
> # echo t >/sys/fs/resctrl/info/L3_MON/mbm_counter0_bytes_type
> # echo l >/sys/fs/resctrl/info/L3_MON/mbm_counter1_bytes_type
> # echo t >/sys/fs/resctrl/info/L3_MON/mbm_counter2_bytes_type
> # echo l >/sys/fs/resctrl/info/L3_MON/mbm_counter3_bytes_type
This also looks different that we are have right now in resctrl fs.
Are you creating separate file for each counter id in
/sys/fs/resctrl/info/L3_MON/?
>
> ...
>
> # cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_counter1_bytes
>
> etc.
>
> -->8--
>
> Any thoughts on this, Peter?
>
> [...]
>
> Cheers
> ---Dave
>
--
Thanks
Babu Moger
Powered by blists - more mailing lists