Message-ID: <18b8122d45931e1d84565887f99b76667021d893.camel@xry111.site>
Date: Thu, 20 Nov 2025 16:07:42 +0800
From: Xi Ruoyao <xry111@...111.site>
To: George Guo <dongtai.guo@...ux.dev>, Huacai Chen <chenhuacai@...nel.org>,
  WANG Xuerui <kernel@...0n.name>
Cc: loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org, George Guo
	 <guodongtai@...inos.cn>
Subject: Re: [PATCH 1/2] LoongArch: Add 128-bit atomic cmpxchg support

On Thu, 2025-11-20 at 15:45 +0800, George Guo wrote:
> From: George Guo <guodongtai@...inos.cn>
> 
> Implement 128-bit atomic compare-and-exchange using LoongArch's
> LL.D/SC.Q instructions.
> 
> At the same time, fix BPF scheduler test failures (scx_central scx_qmap)
> caused by kmalloc_nolock_noprof returning NULL due to missing
> 128-bit atomics. The NULL returns led to -ENOMEM errors during
> scheduler initialization, causing test cases to fail.
> 
> Verified by testing with the scx_qmap scheduler (located in
> tools/sched_ext/). Building with `make` and running
> ./tools/sched_ext/build/bin/scx_qmap.
> 
> Signed-off-by: George Guo <guodongtai@...inos.cn>
> ---
>  arch/loongarch/include/asm/cmpxchg.h | 46 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
> 
> diff --git a/arch/loongarch/include/asm/cmpxchg.h b/arch/loongarch/include/asm/cmpxchg.h
> index 979fde61bba8a42cb4f019f13ded2a3119d4aaf4..5f8d418595cf62ec3153dd3825d80ac1fb31e883 100644
> --- a/arch/loongarch/include/asm/cmpxchg.h
> +++ b/arch/loongarch/include/asm/cmpxchg.h
> @@ -111,6 +111,43 @@ __arch_xchg(volatile void *ptr, unsigned long x, int size)
>  	__ret;								\
>  })
>  
> +union __u128_halves {
> +	u128 full;
> +	struct {
> +		u64 low;
> +		u64 high;
> +	};
> +};
> +
> +#define __cmpxchg128_asm(ld, st, ptr, old, new)				\
> +({									\
> +	union __u128_halves __old, __new, __ret;			\
> +	volatile u64 *__ptr = (volatile u64 *)(ptr);			\
> +									\
> +	__old.full = (old);                                             \
> +	__new.full = (new);						\
> +									\
> +	__asm__ __volatile__(						\
> +	"1:   " ld "  %0, %4          # 128-bit cmpxchg low  \n"	\
> +	"     " ld "  %1, %5          # 128-bit cmpxchg high \n"	\

This is incorrect.  It may happen that:

      SMP 1        |        SMP 2
ll.d $r4, mem      |
                   |   sc.q $t0, $t1, mem
ll.d $r5, mem + 8  |

The second ll.d instruction sets the LLbit again, so you lose the
information of whether the reservation from the first ll.d was still
intact.  Thus you cannot tell whether someone has modified mem during
your "critical section": the final sc.q can succeed even though the two
halves were read from two different 128-bit values, i.e. a torn read.

You should use a normal ld.d for the high doubleword instead.  And you
need to insert a dbar between the ll.d and the ld.d to prevent
reordering.
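
For illustration, the corrected inner sequence would look roughly like
the sketch below.  This is not taken from the patch: the register
allocation ($a2..$a6, $t0/$t1), the dbar hint value 0, and the label
layout are all placeholders chosen for readability; only the
instruction ordering (ll.d, dbar, ld.d, compares, sc.q, retry) is the
point being made.

```
	# $a4 = pointer, $a2/$a3 = old.low/old.high, $a5/$a6 = new.low/new.high
1:	ll.d	$a0, $a4, 0	# load low doubleword; this sets the LLbit
	dbar	0		# keep the plain load below after the ll.d
	ld.d	$a1, $a4, 8	# load high doubleword with a plain ld.d
	bne	$a0, $a2, 2f	# low half does not match old: bail out
	bne	$a1, $a3, 2f	# high half does not match old: bail out
	move	$t0, $a5	# new.low  into the sc.q register pair
	move	$t1, $a6	# new.high into the sc.q register pair
	sc.q	$t0, $t1, $a4	# store all 128 bits iff LLbit is still set
	beqz	$t0, 1b		# reservation was lost: retry from the ll.d
2:
```

Because only the first load carries the reservation, any intervening
store to either half clears the LLbit and forces the sc.q to fail, so
a torn read can never be committed.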

-- 
Xi Ruoyao <xry111@...111.site>
