[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <884d8020-8057-4748-a981-2a6f8f127c45@intel.com>
Date: Tue, 12 Nov 2024 11:24:31 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghua.yu@...el.com>, "Peter
Newman" <peternewman@...gle.com>, Jonathan Corbet <corbet@....net>, Shuah
Khan <skhan@...uxfoundation.org>, <x86@...nel.org>
CC: James Morse <james.morse@....com>, Jamie Iles <quic_jiles@...cinc.com>,
Babu Moger <babu.moger@....com>, Randy Dunlap <rdunlap@...radead.org>,
"Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>,
<linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v8 1/7] x86/resctrl: Prepare for per-ctrl_mon group
mba_MBps control
Hi Tony,
On 10/29/24 10:28 AM, Tony Luck wrote:
> Resctrl uses local memory bandwidth event as input to the feedback
> loop when the mba_MBps mount option is used. This means that this
> mount option cannot be used on systems that only support monitoring
> of total bandwidth.
>
> Prepare to allow users to choose the input event independently for
> each ctrl_mon group.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
> include/linux/resctrl.h | 2 ++
> arch/x86/kernel/cpu/resctrl/internal.h | 2 ++
> arch/x86/kernel/cpu/resctrl/core.c | 5 +++++
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++++
> 4 files changed, 15 insertions(+)
>
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index d94abba1c716..fd05b937e2f4 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -49,6 +49,8 @@ enum resctrl_event_id {
> QOS_L3_MBM_LOCAL_EVENT_ID = 0x03,
> };
>
> +extern enum resctrl_event_id mba_mbps_default_event;
> +
> /**
> * struct resctrl_staged_config - parsed configuration to be applied
> * @new_ctrl: new ctrl value to be loaded
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 955999aecfca..a6f051fb2e69 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -283,6 +283,7 @@ struct pseudo_lock_region {
> * monitor only or ctrl_mon group
> * @mon: mongroup related data
> * @mode: mode of resource group
> + * @mba_mbps_event: input event id when mba_sc mode is active
Referring to mba_sc as a "mode" is potentially confusing when considering the
line right above it (specifically, mba_sc is not a possible value of @mode).
How about "input monitoring event id when MBA software controller (mba_sc) is enabled"
or "input monitoring event id when mba_sc is enabled"?
> * @plr: pseudo-locked region
> */
> struct rdtgroup {
> @@ -295,6 +296,7 @@ struct rdtgroup {
> enum rdt_group_type type;
> struct mongroup mon;
> enum rdtgrp_mode mode;
> + enum resctrl_event_id mba_mbps_event;
> struct pseudo_lock_region *plr;
> };
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index b681c2e07dbf..5b55a7ac7013 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -958,6 +958,11 @@ static __init bool get_rdt_mon_resources(void)
> if (rdt_cpu_has(X86_FEATURE_CQM_MBM_LOCAL))
> rdt_mon_features |= (1 << QOS_L3_MBM_LOCAL_EVENT_ID);
>
> + if (rdt_mon_features & (1 << QOS_L3_MBM_LOCAL_EVENT_ID))
> + mba_mbps_default_event = QOS_L3_MBM_LOCAL_EVENT_ID;
> + else if (rdt_mon_features & (1 << QOS_L3_MBM_TOTAL_EVENT_ID))
> + mba_mbps_default_event = QOS_L3_MBM_TOTAL_EVENT_ID;
> +
> if (!rdt_mon_features)
> return false;
Shouldn't checking of individual monitoring features be at this point? That is,
after confirming that there are indeed monitoring features?
fyi ... since these checks are not architectural I expect that they will
move to resctrl_mon_resource_init() as part of the resctrl fs/arch split.
https://lore.kernel.org/all/20241004180347.19985-17-james.morse@arm.com/
This move will be supported when using the helpers as Fenghua suggested.
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index d7163b764c62..dbfb9d11f3f8 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -65,6 +65,8 @@ static void rdtgroup_destroy_root(void);
>
> struct dentry *debugfs_resctrl;
>
> +enum resctrl_event_id mba_mbps_default_event;
> +
> static bool resctrl_debug;
>
> void rdt_last_cmd_clear(void)
> @@ -2665,6 +2667,8 @@ static int rdt_get_tree(struct fs_context *fc)
> if (ret)
> goto out_schemata_free;
>
> + rdtgroup_default.mba_mbps_event = mba_mbps_default_event;
> +
The pattern of the default group properties are to initialize them once
in rdtgroup_setup_default() and then do any needed reset in rdt_kill_sb().
You can compare with how rdtgroup_default.mode is handled.
If existing pattern is not acceptable then it could be reworked to have one pattern
support all scenarios instead of fragmenting the default group's initialization and
reset.
I do notice that the current MPAM proposal I linked above calls
rdtgroup_setup_default() before resctrl_mon_resource_init() so I think that
needs to be swapped to support this feature.
> kernfs_activate(rdtgroup_default.kn);
>
> ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
> @@ -3624,6 +3628,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
> }
> }
>
> + rdtgrp->mba_mbps_event = mba_mbps_default_event;
> +
Should this be in the "if (resctrl_arch_mon_capable())" section? Looking at rdtgroup_default
I also do not see any check for whether monitoring is supported. Is this intended?
mba_mbps_default_event is only initialized if monitoring is supported so it *seems* as though
the value of 0, which is not a valid value of enum resctrl_event_id is intended as an
"uninitialized"? Unclear how to interpret this implementation at this point. Reading
the changelog again there are no details about these subtleties.
> goto out_unlock;
>
> out_del_list:
Reinette
Powered by blists - more mailing lists