Message-ID:
 <OSZPR01MB87987088A6A033D03458AD188B85A@OSZPR01MB8798.jpnprd01.prod.outlook.com>
Date: Thu, 8 Jan 2026 10:06:33 +0000
From: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@...itsu.com>
To: Ben Horgan <ben.horgan@....com>
CC: "amitsinght@...vell.com" <amitsinght@...vell.com>,
	"baisheng.gao@...soc.com" <baisheng.gao@...soc.com>,
	"baolin.wang@...ux.alibaba.com" <baolin.wang@...ux.alibaba.com>,
	"carl@...amperecomputing.com" <carl@...amperecomputing.com>,
	"dave.martin@....com" <dave.martin@....com>, "david@...nel.org"
	<david@...nel.org>, "dfustini@...libre.com" <dfustini@...libre.com>,
	"fenghuay@...dia.com" <fenghuay@...dia.com>, "gshan@...hat.com"
	<gshan@...hat.com>, "james.morse@....com" <james.morse@....com>,
	"jonathan.cameron@...wei.com" <jonathan.cameron@...wei.com>,
	"kobak@...dia.com" <kobak@...dia.com>, "lcherian@...vell.com"
	<lcherian@...vell.com>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "peternewman@...gle.com"
	<peternewman@...gle.com>, "punit.agrawal@....qualcomm.com"
	<punit.agrawal@....qualcomm.com>, "quic_jiles@...cinc.com"
	<quic_jiles@...cinc.com>, "reinette.chatre@...el.com"
	<reinette.chatre@...el.com>, "rohit.mathew@....com" <rohit.mathew@....com>,
	"scott@...amperecomputing.com" <scott@...amperecomputing.com>,
	"sdonthineni@...dia.com" <sdonthineni@...dia.com>, "xhao@...ux.alibaba.com"
	<xhao@...ux.alibaba.com>, "catalin.marinas@....com"
	<catalin.marinas@....com>, "will@...nel.org" <will@...nel.org>,
	"corbet@....net" <corbet@....net>, "maz@...nel.org" <maz@...nel.org>,
	"oupton@...nel.org" <oupton@...nel.org>, "joey.gouly@....com"
	<joey.gouly@....com>, "suzuki.poulose@....com" <suzuki.poulose@....com>,
	"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>
Subject: Re: [PATCH v2 07/45] arm64: mpam: Context switch the MPAM registers

Hello Ben,

> From: James Morse <james.morse@....com>
> 
> MPAM allows traffic in the SoC to be labeled by the OS, these labels are
> used to apply policy in caches and bandwidth regulators, and to monitor
> traffic in the SoC. The label is made up of a PARTID and PMG value. The x86
> equivalent calls these CLOSID and RMID, but they don't map precisely.
> 
> MPAM has two CPU system registers that are used to hold the PARTID and PMG
> values that traffic generated at each exception level will use. These can
> be set per-task by the resctrl file system. (resctrl is the de facto
> interface for controlling this stuff).
> 
> Add a helper to switch this.
> 
> struct task_struct's separate CLOSID and RMID fields are insufficient to
> implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and
> PMG (sort of like the RMID) separately. On x86, the rmid is an independent
> number, so a race that writes a mismatched closid and rmid into hardware is
> benign. On arm64, the pmg bits extend the partid.
> (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0).  In
> this case, mismatching the values will 'dirty' a pmg value that resctrl
> believes is clean, and is not tracking with its 'limbo' code.
> 
> To avoid this, the partid and pmg are always read and written as a pair.
> Instead of making struct task_struct's closid and rmid fields an
> endian-unsafe union, add the value to struct thread_info and always use
> READ_ONCE()/WRITE_ONCE() when accessing this field.
> 
> Resctrl allows a per-cpu 'default' value to be set, this overrides the
> values when scheduling a task in the default control-group, which has
> PARTID 0. The way 'code data prioritisation' gets emulated means the
> register value for the default group needs to be a variable.
> 
> The current system register value is kept in a per-cpu variable to avoid
> writing to the system register if the value isn't going to change.  Writes
> to this register may reset the hardware state for regulating bandwidth.
> 
> Finally, there is no reason to context switch these registers unless there
> is a driver changing the values in struct task_struct. Hide the whole thing
> behind a static key. This also allows the driver to disable MPAM in
> response to errors reported by hardware. Move the existing static key to
> belong to the arch code, as in the future the MPAM driver may become a
> loadable module.
> 
> All this should depend on whether there is an MPAM driver, hide it behind
> CONFIG_ARM64_MPAM.
> 
> CC: Amit Singh Tomar <amitsinght@...vell.com>
> Signed-off-by: James Morse <james.morse@....com>
> Signed-off-by: Ben Horgan <ben.horgan@....com>
> ---
> CONFIG_MPAM -> CONFIG_ARM64_MPAM in commit message
> Remove extra DECLARE_STATIC_KEY_FALSE
> Function name in comment, __mpam_sched_in() -> mpam_thread_switch()
> Remove unused headers
> Expand comment (Jonathan)
> ---
>  arch/arm64/Kconfig                   |  2 +
>  arch/arm64/include/asm/mpam.h        | 73 ++++++++++++++++++++++++++++
>  arch/arm64/include/asm/thread_info.h |  3 ++
>  arch/arm64/kernel/Makefile           |  1 +
>  arch/arm64/kernel/mpam.c             | 13 +++++
>  arch/arm64/kernel/process.c          |  7 +++
>  drivers/resctrl/mpam_devices.c       |  2 -
>  drivers/resctrl/mpam_internal.h      |  4 +-
>  8 files changed, 101 insertions(+), 4 deletions(-)
>  create mode 100644 arch/arm64/include/asm/mpam.h
>  create mode 100644 arch/arm64/kernel/mpam.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 93173f0a09c7..cdcc5b76a110 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2049,6 +2049,8 @@ config ARM64_MPAM
>  
>            MPAM is exposed to user-space via the resctrl pseudo filesystem.
>  
> +         This option enables the extra context switch code.
> +
>  endmenu # "ARMv8.4 architectural features"
>  
>  menu "ARMv8.5 architectural features"
> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
> new file mode 100644
> index 000000000000..2ab3dca6977c
> --- /dev/null
> +++ b/arch/arm64/include/asm/mpam.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (C) 2025 Arm Ltd. */
> +
> +#ifndef __ASM__MPAM_H
> +#define __ASM__MPAM_H
> +
> +#include <linux/jump_label.h>
> +#include <linux/percpu.h>
> +#include <linux/sched.h>
> +
> +#include <asm/sysreg.h>
> +
> +DECLARE_STATIC_KEY_FALSE(mpam_enabled);
> +DECLARE_PER_CPU(u64, arm64_mpam_default);
> +DECLARE_PER_CPU(u64, arm64_mpam_current);
> +
> +/*
> + * The value of the MPAM0_EL1 sysreg when a task is in resctrl's default group.
> + * This is used by the context switch code to use the resctrl CPU property
> + * instead. The value is modified when CDP is enabled/disabled by mounting
> + * the resctrl filesystem.
> + */
> +extern u64 arm64_mpam_global_default;
> +
> +/*
> + * The resctrl filesystem writes to the partid/pmg values for threads and CPUs,
> + * which may race with reads in mpam_thread_switch(). Ensure only one of the old
> + * or new values are used. Particular care should be taken with the pmg field as
> + * mpam_thread_switch() may read a partid and pmg that don't match, causing this
> + * value to be stored with cache allocations, despite being considered 'free' by
> + * resctrl.
> + *
> + * A value in struct thread_info is used instead of struct task_struct as the
> + * cpu's u64 register format is used. In struct task_struct there are two u32,
> + * rmid and closid for the x86 case, but as we can't use them here do something
> + * else. Creating a union would mean only accesses from the created u64 would be
> + * endian safe and so be less clear.
> + */
> +static inline u64 mpam_get_regval(struct task_struct *tsk)
> +{
> +#ifdef CONFIG_ARM64_MPAM
> +       return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
> +#else
> +       return 0;
> +#endif
> +}
> +
> +static inline void mpam_thread_switch(struct task_struct *tsk)
> +{
> +       u64 oldregval;
> +       int cpu = smp_processor_id();
> +       u64 regval = mpam_get_regval(tsk);
> +
> +       if (!IS_ENABLED(CONFIG_ARM64_MPAM) ||
> +           !static_branch_likely(&mpam_enabled))
> +               return;
> +
> +       if (regval == READ_ONCE(arm64_mpam_global_default))
> +               regval = READ_ONCE(per_cpu(arm64_mpam_default, cpu));
> +
> +       oldregval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
> +       if (oldregval == regval)
> +               return;
> +
> +       write_sysreg_s(regval, SYS_MPAM1_EL1);
> +       isb();
> +
> +       /* Synchronising the EL0 write is left until the ERET to EL0 */
> +       write_sysreg_s(regval, SYS_MPAM0_EL1);
> +
> +       WRITE_ONCE(per_cpu(arm64_mpam_current, cpu), regval);
> +}

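How about changing the code as follows, so that the CONFIG_ARM64_MPAM check is done with #ifdef and the disabled case gets an empty stub? (Compare how "mte_thread_switch(next);" is handled where it is called in "arch/arm64/kernel/process.c".)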

 static inline u64 mpam_get_regval(struct task_struct *tsk)
 {
-#ifdef CONFIG_ARM64_MPAM
        return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
-#else
-       return 0;
-#endif
 }

+#ifdef CONFIG_ARM64_MPAM
 static inline void mpam_thread_switch(struct task_struct *tsk)
 {
        u64 oldregval;
        int cpu = smp_processor_id();
        u64 regval = mpam_get_regval(tsk);

-       if (!IS_ENABLED(CONFIG_ARM64_MPAM) ||
-           !static_branch_likely(&mpam_enabled))
+       if (!static_branch_likely(&mpam_enabled))
                return;

        if (regval == READ_ONCE(arm64_mpam_global_default))
@@ -101,4 +96,8 @@ static inline void mpam_thread_switch(struct task_struct *tsk)

        WRITE_ONCE(per_cpu(arm64_mpam_current, cpu), regval);
 }
+#else
+static inline void mpam_thread_switch(struct task_struct *tsk){}
+#endif
+
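
For illustration only, here is a minimal userspace sketch of the pattern suggested above. It is not kernel code: SKETCH_MPAM and all sketch_* names are hypothetical stand-ins that I made up for this example. A plain bool replaces the mpam_enabled static key, single variables replace the per-cpu ones, and a counter replaces the write_sysreg_s() calls. In this sketch the accessor is kept inside the guard, because the field it reads only exists when the option is set (as with mpam_partid_pmg in struct thread_info in the patch).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_ARM64_MPAM; build with -DSKETCH_MPAM=0 for the stub. */
#ifndef SKETCH_MPAM
#define SKETCH_MPAM 1
#endif

struct sketch_task {
#if SKETCH_MPAM
	uint64_t partid_pmg;	/* only present when the feature is built in */
#endif
	int id;
};

#if SKETCH_MPAM
static bool sketch_enabled;		/* stands in for the mpam_enabled static key */
static uint64_t sketch_global_default;	/* stands in for arm64_mpam_global_default */
static uint64_t sketch_cpu_default;	/* stands in for per-cpu arm64_mpam_default */
static uint64_t sketch_current;		/* stands in for per-cpu arm64_mpam_current */
static unsigned int sketch_sysreg_writes; /* counts simulated register updates */

static inline uint64_t sketch_get_regval(const struct sketch_task *tsk)
{
	return tsk->partid_pmg;
}

static inline void sketch_thread_switch(const struct sketch_task *tsk)
{
	uint64_t regval;

	if (!sketch_enabled)
		return;

	regval = sketch_get_regval(tsk);

	/* Tasks in the default group use the per-CPU default instead. */
	if (regval == sketch_global_default)
		regval = sketch_cpu_default;

	/* Skip the update when the value would not change. */
	if (regval == sketch_current)
		return;

	sketch_sysreg_writes++;	/* stands in for the sysreg writes */
	sketch_current = regval;
}
#else
/* Feature compiled out: callers need no #ifdef, the call compiles to nothing. */
static inline void sketch_thread_switch(const struct sketch_task *tsk)
{
	(void)tsk;
}
#endif
```

With this shape the call site stays unconditional in both configurations, which is the point of the stub.
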
Best regards,
Shaopeng TAN

> +#endif /* __ASM__MPAM_H */
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index a803b887b0b4..fc801a26ff9e 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -41,6 +41,9 @@ struct thread_info {
>  #ifdef CONFIG_SHADOW_CALL_STACK
>          void                    *scs_base;
>          void                    *scs_sp;
> +#endif
> +#ifdef CONFIG_ARM64_MPAM
> +       u64                     mpam_partid_pmg;
>  #endif
>          u32                     cpu;
>  };
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 76f32e424065..15979f366519 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -67,6 +67,7 @@ obj-$(CONFIG_CRASH_DUMP)              += crash_dump.o
>  obj-$(CONFIG_VMCORE_INFO)               += vmcore_info.o
>  obj-$(CONFIG_ARM_SDE_INTERFACE)         += sdei.o
>  obj-$(CONFIG_ARM64_PTR_AUTH)            += pointer_auth.o
> +obj-$(CONFIG_ARM64_MPAM)               += mpam.o
>  obj-$(CONFIG_ARM64_MTE)                 += mte.o
>  obj-y                                   += vdso-wrap.o
>  obj-$(CONFIG_COMPAT_VDSO)               += vdso32-wrap.o
> diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
> new file mode 100644
> index 000000000000..9866d2ca0faa
> --- /dev/null
> +++ b/arch/arm64/kernel/mpam.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2025 Arm Ltd. */
> +
> +#include <asm/mpam.h>
> +
> +#include <linux/jump_label.h>
> +#include <linux/percpu.h>
> +
> +DEFINE_STATIC_KEY_FALSE(mpam_enabled);
> +DEFINE_PER_CPU(u64, arm64_mpam_default);
> +DEFINE_PER_CPU(u64, arm64_mpam_current);
> +
> +u64 arm64_mpam_global_default;
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index fba7ca102a8c..b510c0699313 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -51,6 +51,7 @@
>  #include <asm/fpsimd.h>
>  #include <asm/gcs.h>
>  #include <asm/mmu_context.h>
> +#include <asm/mpam.h>
>  #include <asm/mte.h>
>  #include <asm/processor.h>
>  #include <asm/pointer_auth.h>
> @@ -737,6 +738,12 @@ struct task_struct *__switch_to(struct task_struct *prev,
>          if (prev->thread.sctlr_user != next->thread.sctlr_user)
>                  update_sctlr_el1(next->thread.sctlr_user);
>  
> +       /*
> +        * MPAM thread switch happens after the DSB to ensure prev's accesses
> +        * use prev's MPAM settings.
> +        */
> +       mpam_thread_switch(next);
> +
>          /* the actual thread switch */
>          last = cpu_switch_to(prev, next);
>  
> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
> index b495d5291868..860181266b15 100644
> --- a/drivers/resctrl/mpam_devices.c
> +++ b/drivers/resctrl/mpam_devices.c
> @@ -29,8 +29,6 @@
>  
>  #include "mpam_internal.h"
>  
> -DEFINE_STATIC_KEY_FALSE(mpam_enabled); /* This moves to arch code */
> -
>  /*
>   * mpam_list_lock protects the SRCU lists when writing. Once the
>   * mpam_enabled key is enabled these lists are read-only,
> diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
> index aaaf31ca9210..e6e7ba4342d6 100644
> --- a/drivers/resctrl/mpam_internal.h
> +++ b/drivers/resctrl/mpam_internal.h
> @@ -16,12 +16,12 @@
>  #include <linux/srcu.h>
>  #include <linux/types.h>
>  
> +#include <asm/mpam.h>
> +
>  #define MPAM_MSC_MAX_NUM_RIS    16
>  
>  struct platform_device;
>  
> -DECLARE_STATIC_KEY_FALSE(mpam_enabled);
> -
>  #ifdef CONFIG_MPAM_KUNIT_TEST
>  #define PACKED_FOR_KUNIT __packed
>  #else
> --
> 2.43.0
