Message-ID: <20240626010924.478611-2-torvalds@linux-foundation.org>
Date: Tue, 25 Jun 2024 18:09:25 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Uros Bizjak <ubizjak@...il.com>,
	Ingo Molnar <mingo@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	x86@...nel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	kernel test robot <lkp@...el.com>
Subject: [PATCH] x86-32: fix cmpxchg8b_emu build error with clang

The kernel test robot reported that clang no longer compiles the 32-bit
x86 kernel in some configurations due to commit 95ece48165c1
("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}()
functions").

The build fails with

  arch/x86/include/asm/cmpxchg_32.h:149:9: error: inline assembly requires more registers than available

and the reason seems to be that not only does the cmpxchg8b instruction
need four fixed registers (EDX:EAX and ECX:EBX), but with the emulation
fallback the inline asm also wants a fifth fixed register for the
address (it uses %esi for that, though that's just a software convention
with cmpxchg8b_emu).

Avoiding the use of another pointer input to the asm (and just forcing it
to use the "0(%esi)" addressing that we end up requiring for the software
fallback) seems to fix the issue.

Reported-by: kernel test robot <lkp@...el.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202406230912.F6XFIyA6-lkp@intel.com/
Fixes: 95ece48165c1 ("locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions")
Link: https://lore.kernel.org/all/202406230912.F6XFIyA6-lkp@intel.com/
Suggested-by: Uros Bizjak <ubizjak@...il.com>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
---

Added commit message, and updated the asm to use '%a[ptr]' instead of
writing out the addressing by hand. 

Still doing the 'oldp' writeback unconditionally.  The code generation
for the case I checked was the same for both clang and gcc, but until
Uros hits me with the big clue-hammer, I think it's the simpler code
that leaves room for potentially better optimizations too.

This falls solidly in the "looks ok to me, but still untested" category
for me.  It fixes the clang build issue in my build testing, but I no
longer have a 32-bit test environment, so no actual runtime testing.
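
For anyone who wants to poke at the constraint scheme outside the kernel,
here is a minimal standalone sketch (not the kernel code, and the function
name and build flags are made up for illustration) of how "+A" pins the old
value in the EDX:EAX pair and how '%a[ptr]' together with the "S" constraint
turns the pointer into a plain (%esi) memory operand:

#include <stdint.h>

/*
 * Illustration only: the kernel version additionally uses ALTERNATIVE()
 * to call cmpxchg8b_emu on !X86_FEATURE_CX8 machines.
 *  - "+A"          : 64-bit 'old' lives in EDX:EAX, in and out
 *  - "b"/"c"       : new value's low/high halves go in EBX/ECX
 *  - [ptr] "S" + %a: pointer pinned in %esi, printed as "(%esi)"
 */
static inline uint64_t cmpxchg64_sketch(volatile uint64_t *ptr,
					uint64_t old, uint64_t new)
{
	uint32_t new_low  = (uint32_t)new;
	uint32_t new_high = (uint32_t)(new >> 32);

	asm volatile("lock cmpxchg8b %a[ptr]"
		     : "+A" (old)
		     : "b" (new_low), "c" (new_high), [ptr] "S" (ptr)
		     : "memory");

	/* On success *ptr now holds 'new'; either way 'old' is what was seen. */
	return old;
}

Built 32-bit with something like 'gcc -m32 -fno-pic -O2 -c' (the -fno-pic is
to keep older PIC toolchains from complaining about the EBX constraint), the
asm should come out as "lock cmpxchg8b (%esi)".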

 arch/x86/include/asm/cmpxchg_32.h | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index ed2797f132ce..4444a8292c7a 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -88,18 +88,17 @@ static __always_inline bool __try_cmpxchg64_local(volatile u64 *ptr, u64 *oldp,
 
 #define __arch_cmpxchg64_emu(_ptr, _old, _new, _lock_loc, _lock)	\
 ({									\
-	union __u64_halves o = { .full = (_old), },			\
-			   n = { .full = (_new), };			\
+	__u64 o = (_old);						\
+	union __u64_halves n = { .full = (_new), };			\
 									\
 	asm volatile(ALTERNATIVE(_lock_loc				\
 				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
-		     : [ptr] "+m" (*(_ptr)),				\
-		       "+a" (o.low), "+d" (o.high)			\
-		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
+		     : "+A" (o)						\
+		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
 		     : "memory");					\
 									\
-	o.full;								\
+	o;								\
 })
 
 static __always_inline u64 arch_cmpxchg64(volatile u64 *ptr, u64 old, u64 new)
@@ -116,22 +115,19 @@ static __always_inline u64 arch_cmpxchg64_local(volatile u64 *ptr, u64 old, u64
 
 #define __arch_try_cmpxchg64_emu(_ptr, _oldp, _new, _lock_loc, _lock)	\
 ({									\
-	union __u64_halves o = { .full = *(_oldp), },			\
-			   n = { .full = (_new), };			\
+	__u64 o = *(_oldp);						\
+	union __u64_halves n = { .full = (_new), };			\
 	bool ret;							\
 									\
 	asm volatile(ALTERNATIVE(_lock_loc				\
 				 "call cmpxchg8b_emu",			\
-				 _lock "cmpxchg8b %[ptr]", X86_FEATURE_CX8) \
+				 _lock "cmpxchg8b %a[ptr]", X86_FEATURE_CX8) \
 		     CC_SET(e)						\
-		     : CC_OUT(e) (ret),					\
-		       [ptr] "+m" (*(_ptr)),				\
-		       "+a" (o.low), "+d" (o.high)			\
-		     : "b" (n.low), "c" (n.high), "S" (_ptr)		\
+		     : CC_OUT(e) (ret), "+A" (o)			\
+		     : "b" (n.low), "c" (n.high), [ptr] "S" (_ptr)	\
 		     : "memory");					\
 									\
-	if (unlikely(!ret))						\
-		*(_oldp) = o.full;					\
+	*(_oldp) = o;							\
 									\
 	likely(ret);							\
 })
-- 
2.45.1.209.gc6f12300df

