[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20251205-2-v4-0-e5ab932cf219@linux.dev>
Date: Fri, 05 Dec 2025 14:29:03 +0800
From: George Guo <dongtai.guo@...ux.dev>
To: Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>
Cc: loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org,
George Guo <dongtai.guo@...ux.dev>, George Guo <guodongtai@...inos.cn>,
Yangyang Lian <lianyangyang@...inos.cn>
Subject: [PATCH v4 0/4] LoongArch: Add 128-bit atomic cmpxchg support (v4)
This patch series adds 128-bit atomic compare-and-exchange support for
LoongArch architecture, which fixes BPF scheduler test failures caused
by missing 128-bit atomics support.
The series consists of four patches:
1. "LoongArch: Add 128-bit atomic cmpxchg support"
- Implements 128-bit atomic compare-and-exchange using LoongArch's
LL.D/SC.Q instructions
- Fixes BPF scheduler test failures (scx_central scx_qmap) where
kmalloc_nolock_noprof returns NULL due to missing 128-bit atomics,
leading to -ENOMEM errors during scheduler initialization
2. "LoongArch: Enable 128-bit atomics cmpxchg support"
- Adds select HAVE_CMPXCHG_DOUBLE and select HAVE_ALIGNED_STRUCT_PAGE
in Kconfig to enable 128-bit atomic cmpxchg support
3. "LoongArch: Add SCQ support detection"
- Check CPUCFG2_SCQ bit to determin if the CPU supports
SCQ instrction.
4. "LoongArch: Use spinlock to emulate 128-bit cmpxchg"
- For LoongArch CPUs lacking 128-bit atomic instruction(e.g.,
the SCQ instruction on 3A5000), provide a fallback implementation
of __cmpxchg128 using a spinlock to emulate the atomic operation.
The issue was identified through BPF scheduler test failures where
scx_central and scx_qmap schedulers would fail to initialize. Testing
was performed using the scx_qmap scheduler from tools/sched_ext/,
confirming that the patches resolve the initialization failures.
Signed-off-by: George Guo <dongtai.guo@...ux.dev>
---
Changes in v4:
- Add SCQ support detection
- Add spinlock to emulate 128-bit cmpxchg
- Link to v3: https://lore.kernel.org/r/20251126-2-v3-0-851b5a516801@linux.dev
Changes in v3:
- dbar 0 -> __WEAK_LLSC_MB
- =ZB" (__ptr[0]) -> "r" (__ptr)
- Link to v2: https://lore.kernel.org/r/20251124-2-v2-0-b38216e25fd9@linux.dev
Changes in v2:
- Use a normal ld.d for the high word instead of ll.d to avoid race
condition
- Insert a dbar between ll.d and ld.d to prevent reordering
- Simply __cmpxchg128_asm("ll.d", "sc.q", ptr, o, n) to __cmpxchg128_asm(ptr, o, n)
- Fix address operand constraints after testing different approaches:
* ld.d with "m"
* ll.d with "ZC",
* sc.q with "ZB"(alternative constraints caused issues:
- "r" caused system hang
- "ZC" caused compiler error:
{standard input}: Assembler messages:
{standard input}:10037: Fatal error: Immediate overflow.
format: u0:0 )
- Link to v1: https://lore.kernel.org/r/20251120-2-v1-0-705bdc440550@linux.dev
---
George Guo (3):
LoongArch: Add 128-bit atomic cmpxchg support
LoongArch: Use spinlock to emulate 128-bit cmpxchg
LoongArch: Enable 128-bit atomics cmpxchg support
george (1):
LoongArch: Add SCQ support detection
arch/loongarch/Kconfig | 2 +
arch/loongarch/include/asm/cmpxchg.h | 66 +++++++++++++++++++++++++++++++
arch/loongarch/include/asm/cpu-features.h | 1 +
arch/loongarch/include/asm/cpu.h | 2 +
arch/loongarch/include/asm/loongarch.h | 1 +
arch/loongarch/kernel/cpu-probe.c | 4 ++
6 files changed, 76 insertions(+)
---
base-commit: 2061f18ad76ecaddf8ed17df81b8611ea88dbddd
change-id: 20251120-2-d03862b2cf6d
Best regards,
--
George Guo <dongtai.guo@...ux.dev>
Powered by blists - more mailing lists