Message-ID:
<OSZPR01MB87987088A6A033D03458AD188B85A@OSZPR01MB8798.jpnprd01.prod.outlook.com>
Date: Thu, 8 Jan 2026 10:06:33 +0000
From: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>
To: Ben Horgan <ben.horgan@....com>
CC: "amitsinght@...vell.com" <amitsinght@...vell.com>,
"baisheng.gao@...soc.com" <baisheng.gao@...soc.com>,
"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
"carl@...amperecomputing.com" <carl@...amperecomputing.com>,
"dave.martin@....com" <dave.martin@....com>, "david@...nel.org"
<david@...nel.org>, "dfustini@...libre.com" <dfustini@...libre.com>,
"fenghuay@...dia.com" <fenghuay@...dia.com>, "gshan@...hat.com"
<gshan@...hat.com>, "james.morse@....com" <james.morse@....com>,
"jonathan.cameron@...wei.com" <jonathan.cameron@...wei.com>,
"kobak@...dia.com" <kobak@...dia.com>, "lcherian@...vell.com"
<lcherian@...vell.com>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "peternewman@...gle.com"
<peternewman@...gle.com>, "punit.agrawal@....qualcomm.com"
<punit.agrawal@....qualcomm.com>, "quic_jiles@...cinc.com"
<quic_jiles@...cinc.com>, "reinette.chatre@...el.com"
<reinette.chatre@...el.com>, "rohit.mathew@....com" <rohit.mathew@....com>,
"scott@...amperecomputing.com" <scott@...amperecomputing.com>,
"sdonthineni@...dia.com" <sdonthineni@...dia.com>, "xhao@...ux.alibaba.com"
<xhao@...ux.alibaba.com>, "catalin.marinas@....com"
<catalin.marinas@....com>, "will@...nel.org" <will@...nel.org>,
"corbet@....net" <corbet@....net>, "maz@...nel.org" <maz@...nel.org>,
"oupton@...nel.org" <oupton@...nel.org>, "joey.gouly@....com"
<joey.gouly@....com>, "suzuki.poulose@....com" <suzuki.poulose@....com>,
"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>
Subject: Re: [PATCH v2 07/45] arm64: mpam: Context switch the MPAM registers
Hello Ben,
> From: James Morse <james.morse@....com>
>
> MPAM allows traffic in the SoC to be labeled by the OS. These labels are
> used to apply policy in caches and bandwidth regulators, and to monitor
> traffic in the SoC. The label is made up of a PARTID and PMG value. The x86
> equivalent calls these CLOSID and RMID, but they don't map precisely.
>
> MPAM has two CPU system registers that are used to hold the PARTID and PMG
> values that traffic generated at each exception level will use. These can
> be set per-task by the resctrl file system. (resctrl is the de facto
> interface for controlling this stuff).
>
> Add a helper to switch this.
>
> struct task_struct's separate CLOSID and RMID fields are insufficient to
> implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and
> PMG (sort of like the RMID) separately. On x86, the rmid is an independent
> number, so a race that writes a mismatched closid and rmid into hardware is
> benign. On arm64, the pmg bits extend the partid.
> (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0). In
> this case, mismatching the values will 'dirty' a pmg value that resctrl
> believes is clean, and is not tracking with its 'limbo' code.
>
> To avoid this, the partid and pmg are always read and written as a pair.
> Instead of making struct task_struct's closid and rmid fields an
> endian-unsafe union, add the value to struct thread_info and always use
> READ_ONCE()/WRITE_ONCE() when accessing this field.
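
For readers following along, the "read and written as a pair" idea can be sketched in userspace C. This is only an illustration under stated assumptions: the field positions below are chosen for the example rather than taken from the patch, the helper names are hypothetical, and relaxed C11 atomics stand in for READ_ONCE()/WRITE_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed field layout, for illustration only (not taken from the patch). */
#define PARTID_SHIFT	0
#define PARTID_MASK	0xffffULL
#define PMG_SHIFT	32
#define PMG_MASK	0xffULL

static_assert(PMG_SHIFT >= 16, "PMG must not overlap the PARTID bits");

/* Hypothetical stand-in for the single u64 kept in struct thread_info. */
static _Atomic uint64_t mpam_partid_pmg;

static uint64_t pack_partid_pmg(uint16_t partid, uint8_t pmg)
{
	return ((uint64_t)partid << PARTID_SHIFT) |
	       ((uint64_t)pmg << PMG_SHIFT);
}

/* Writer: both fields are published with one 64-bit store. */
static void set_partid_pmg(uint16_t partid, uint8_t pmg)
{
	atomic_store_explicit(&mpam_partid_pmg, pack_partid_pmg(partid, pmg),
			      memory_order_relaxed);
}

/* Reader: one 64-bit load, so partid and pmg always come from the same write. */
static void get_partid_pmg(uint16_t *partid, uint8_t *pmg)
{
	uint64_t v = atomic_load_explicit(&mpam_partid_pmg,
					  memory_order_relaxed);

	*partid = (uint16_t)((v >> PARTID_SHIFT) & PARTID_MASK);
	*pmg = (uint8_t)((v >> PMG_SHIFT) & PMG_MASK);
}
```

Because the reader never loads the two fields separately, it cannot observe partid-5 combined with a pmg that was written for partid-6, which is the mismatch the commit message is concerned about.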
>
> Resctrl allows a per-cpu 'default' value to be set; this overrides the
> values when scheduling a task in the default control-group, which has
> PARTID 0. The way 'code data prioritisation' gets emulated means the
> register value for the default group needs to be a variable.
>
> The current system register value is kept in a per-cpu variable to avoid
> writing to the system register if the value isn't going to change. Writes
> to this register may reset the hardware state for regulating bandwidth.
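
The two paragraphs above (per-cpu default substitution, then skipping the write when nothing changes) can be modelled roughly as follows. This is a simplified single-CPU sketch, not the patch's code: all names are hypothetical, plain variables replace the per-cpu variables, and a counter replaces the actual system register write.

```c
#include <stdint.h>

/* Hypothetical single-CPU model; the patch uses per-cpu variables instead. */
static uint64_t global_default;		/* value that means "default group" */
static uint64_t cpu_default;		/* this CPU's replacement value */
static uint64_t cpu_current;		/* last value written to the register */
static unsigned int sysreg_writes;	/* counts simulated register writes */

static void sim_thread_switch(uint64_t task_regval)
{
	uint64_t regval = task_regval;

	/* A task in the default group takes the CPU's default value. */
	if (regval == global_default)
		regval = cpu_default;

	/* Skip the write if the register already holds this value. */
	if (regval == cpu_current)
		return;

	sysreg_writes++;		/* stands in for the sysreg write */
	cpu_current = regval;
}
```

With cpu_default set to 3, switching twice in a row to tasks in the default group performs one simulated write, and a later switch to a task with value 5 performs a second one.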
>
> Finally, there is no reason to context switch these registers unless there
> is a driver changing the values in struct task_struct. Hide the whole thing
> behind a static key. This also allows the driver to disable MPAM in
> response to errors reported by hardware. Move the existing static key to
> belong to the arch code, as in the future the MPAM driver may become a
> loadable module.
>
> All this should depend on whether there is an MPAM driver, so hide it behind
> CONFIG_ARM64_MPAM.
>
> CC: Amit Singh Tomar <amitsinght@...vell.com>
> Signed-off-by: James Morse <james.morse@....com>
> Signed-off-by: Ben Horgan <ben.horgan@....com>
> ---
> CONFIG_MPAM -> CONFIG_ARM64_MPAM in commit message
> Remove extra DECLARE_STATIC_KEY_FALSE
> Function name in comment, __mpam_sched_in() -> mpam_thread_switch()
> Remove unused headers
> Expand comment (Jonathan)
> ---
> arch/arm64/Kconfig | 2 +
> arch/arm64/include/asm/mpam.h | 73 ++++++++++++++++++++++++++++
> arch/arm64/include/asm/thread_info.h | 3 ++
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/mpam.c | 13 +++++
> arch/arm64/kernel/process.c | 7 +++
> drivers/resctrl/mpam_devices.c | 2 -
> drivers/resctrl/mpam_internal.h | 4 +-
> 8 files changed, 101 insertions(+), 4 deletions(-)
> create mode 100644 arch/arm64/include/asm/mpam.h
> create mode 100644 arch/arm64/kernel/mpam.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 93173f0a09c7..cdcc5b76a110 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2049,6 +2049,8 @@ config ARM64_MPAM
>
> MPAM is exposed to user-space via the resctrl pseudo filesystem.
>
> + This option enables the extra context switch code.
> +
> endmenu # "ARMv8.4 architectural features"
>
> menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
> new file mode 100644
> index 000000000000..2ab3dca6977c
> --- /dev/null
> +++ b/arch/arm64/include/asm/mpam.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (C) 2025 Arm Ltd. */
> +
> +#ifndef __ASM__MPAM_H
> +#define __ASM__MPAM_H
> +
> +#include <linux/jump_label.h>
> +#include <linux/percpu.h>
> +#include <linux/sched.h>
> +
> +#include <asm/sysreg.h>
> +
> +DECLARE_STATIC_KEY_FALSE(mpam_enabled);
> +DECLARE_PER_CPU(u64, arm64_mpam_default);
> +DECLARE_PER_CPU(u64, arm64_mpam_current);
> +
> +/*
> + * The value of the MPAM0_EL1 sysreg when a task is in resctrl's default group.
> + * This is used by the context switch code to use the resctrl CPU property
> + * instead. The value is modified when CDP is enabled/disabled by mounting
> + * the resctrl filesystem.
> + */
> +extern u64 arm64_mpam_global_default;
> +
> +/*
> + * The resctrl filesystem writes to the partid/pmg values for threads and CPUs,
> + * which may race with reads in mpam_thread_switch(). Ensure only one of the old
> + * or new values is used. Particular care should be taken with the pmg field as
> + * mpam_thread_switch() may read a partid and pmg that don't match, causing this
> + * value to be stored with cache allocations, despite being considered 'free' by
> + * resctrl.
> + *
> + * A value in struct thread_info is used instead of struct task_struct as the
> + * cpu's u64 register format is used. In struct task_struct there are two u32,
> + * rmid and closid for the x86 case, but as we can't use them here do something
> + * else. Creating a union would mean only accesses from the created u64 would be
> + * endian safe and so be less clear.
> + */
> +static inline u64 mpam_get_regval(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_ARM64_MPAM
> + return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
> +#else
> + return 0;
> +#endif
> +}
> +
> +static inline void mpam_thread_switch(struct task_struct *tsk)
> +{
> + u64 oldregval;
> + int cpu = smp_processor_id();
> + u64 regval = mpam_get_regval(tsk);
> +
> + if (!IS_ENABLED(CONFIG_ARM64_MPAM) ||
> + !static_branch_likely(&mpam_enabled))
> + return;
> +
> + if (regval == READ_ONCE(arm64_mpam_global_default))
> + regval = READ_ONCE(per_cpu(arm64_mpam_default, cpu));
> +
> + oldregval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
> + if (oldregval == regval)
> + return;
> +
> + write_sysreg_s(regval, SYS_MPAM1_EL1);
> + isb();
> +
> + /* Synchronising the EL0 write is left until the ERET to EL0 */
> + write_sysreg_s(regval, SYS_MPAM0_EL1);
> +
> + WRITE_ONCE(per_cpu(arm64_mpam_current, cpu), regval);
> +}
How about changing the code as follows, so that the CONFIG_ARM64_MPAM check is done with #ifdef and an empty stub instead of IS_ENABLED()? (Refer to "mte_thread_switch(next);" in "arch/arm64/kernel/process.c")
static inline u64 mpam_get_regval(struct task_struct *tsk)
{
-#ifdef CONFIG_ARM64_MPAM
return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
-#else
- return 0;
-#endif
}
+#ifdef CONFIG_ARM64_MPAM
static inline void mpam_thread_switch(struct task_struct *tsk)
{
u64 oldregval;
int cpu = smp_processor_id();
u64 regval = mpam_get_regval(tsk);
- if (!IS_ENABLED(CONFIG_ARM64_MPAM) ||
- !static_branch_likely(&mpam_enabled))
+ if (!static_branch_likely(&mpam_enabled))
return;
if (regval == READ_ONCE(arm64_mpam_global_default))
@@ -101,4 +96,8 @@ static inline void mpam_thread_switch(struct task_struct *tsk)
WRITE_ONCE(per_cpu(arm64_mpam_current, cpu), regval);
}
+#else
+static inline void mpam_thread_switch(struct task_struct *tsk){}
+#endif
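
To make the shape of this suggestion concrete, here is a small standalone sketch of the #ifdef-plus-empty-stub pattern. Everything in it is hypothetical: DEMO_MPAM stands in for CONFIG_ARM64_MPAM, struct demo_task for the task and thread_info structures, and demo_sysreg for the system register. In this sketch the accessor is also placed inside the guarded region, since the field it reads only exists when the option is set.

```c
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_ARM64_MPAM; remove to build the stub. */
#define DEMO_MPAM 1

struct demo_task {
#ifdef DEMO_MPAM
	uint64_t mpam_partid_pmg;	/* only present when the option is set */
#endif
	int pid;
};

#ifdef DEMO_MPAM
static uint64_t demo_sysreg;		/* stands in for the system register */

static inline uint64_t demo_get_regval(const struct demo_task *t)
{
	return t->mpam_partid_pmg;
}

static inline void demo_thread_switch(const struct demo_task *next)
{
	uint64_t regval = demo_get_regval(next);

	if (regval == demo_sysreg)
		return;

	demo_sysreg = regval;
}
#else
/* With the option off, callers still compile but the call does nothing. */
static inline void demo_thread_switch(const struct demo_task *next)
{
	(void)next;
}
#endif
```

The caller can then invoke demo_thread_switch() unconditionally, and no reference to the conditional field is left in a build where the option is disabled.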
+
Best regards,
Shaopeng TAN
> +#endif /* __ASM__MPAM_H */
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index a803b887b0b4..fc801a26ff9e 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -41,6 +41,9 @@ struct thread_info {
> #ifdef CONFIG_SHADOW_CALL_STACK
> void *scs_base;
> void *scs_sp;
> +#endif
> +#ifdef CONFIG_ARM64_MPAM
> + u64 mpam_partid_pmg;
> #endif
> u32 cpu;
> };
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 76f32e424065..15979f366519 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -67,6 +67,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
> obj-$(CONFIG_VMCORE_INFO) += vmcore_info.o
> obj-$(CONFIG_ARM_SDE_INTERFACE) += sdei.o
> obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
> +obj-$(CONFIG_ARM64_MPAM) += mpam.o
> obj-$(CONFIG_ARM64_MTE) += mte.o
> obj-y += vdso-wrap.o
> obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
> diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
> new file mode 100644
> index 000000000000..9866d2ca0faa
> --- /dev/null
> +++ b/arch/arm64/kernel/mpam.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2025 Arm Ltd. */
> +
> +#include <asm/mpam.h>
> +
> +#include <linux/jump_label.h>
> +#include <linux/percpu.h>
> +
> +DEFINE_STATIC_KEY_FALSE(mpam_enabled);
> +DEFINE_PER_CPU(u64, arm64_mpam_default);
> +DEFINE_PER_CPU(u64, arm64_mpam_current);
> +
> +u64 arm64_mpam_global_default;
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index fba7ca102a8c..b510c0699313 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -51,6 +51,7 @@
> #include <asm/fpsimd.h>
> #include <asm/gcs.h>
> #include <asm/mmu_context.h>
> +#include <asm/mpam.h>
> #include <asm/mte.h>
> #include <asm/processor.h>
> #include <asm/pointer_auth.h>
> @@ -737,6 +738,12 @@ struct task_struct *__switch_to(struct task_struct *prev,
> if (prev->thread.sctlr_user != next->thread.sctlr_user)
> update_sctlr_el1(next->thread.sctlr_user);
>
> + /*
> + * MPAM thread switch happens after the DSB to ensure prev's accesses
> + * use prev's MPAM settings.
> + */
> + mpam_thread_switch(next);
> +
> /* the actual thread switch */
> last = cpu_switch_to(prev, next);
>
> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
> index b495d5291868..860181266b15 100644
> --- a/drivers/resctrl/mpam_devices.c
> +++ b/drivers/resctrl/mpam_devices.c
> @@ -29,8 +29,6 @@
>
> #include "mpam_internal.h"
>
> -DEFINE_STATIC_KEY_FALSE(mpam_enabled); /* This moves to arch code */
> -
> /*
> * mpam_list_lock protects the SRCU lists when writing. Once the
> * mpam_enabled key is enabled these lists are read-only,
> diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
> index aaaf31ca9210..e6e7ba4342d6 100644
> --- a/drivers/resctrl/mpam_internal.h
> +++ b/drivers/resctrl/mpam_internal.h
> @@ -16,12 +16,12 @@
> #include <linux/srcu.h>
> #include <linux/types.h>
>
> +#include <asm/mpam.h>
> +
> #define MPAM_MSC_MAX_NUM_RIS 16
>
> struct platform_device;
>
> -DECLARE_STATIC_KEY_FALSE(mpam_enabled);
> -
> #ifdef CONFIG_MPAM_KUNIT_TEST
> #define PACKED_FOR_KUNIT __packed
> #else
> --
> 2.43.0