Message-ID: <aA8oqKUaFU-0wb-D@gmail.com>
Date: Mon, 28 Apr 2025 09:05:12 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>,
	Arnd Bergmann <arnd@...db.de>, Arnd Bergmann <arnd@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H. Peter Anvin" <hpa@...or.com>, Juergen Gross <jgross@...e.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	Alexander Usyskin <alexander.usyskin@...el.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Mateusz Jończyk <mat.jonczyk@...pl>,
	Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
	xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case
 handling to C


* Ingo Molnar <mingo@...nel.org> wrote:

> > UNTESTED patch applied in case somebody wants to play with this. It
> > removes 10 lines of silly code, and along with them that 'cmov' use.
> > 
> > Anybody?
> 
> Makes sense - it seems to boot here, but I only did some very light 
> testing.
> 
> There's a minor text size increase on x86-32 defconfig, GCC 14.2.0:
> 
>       text       data        bss         dec        hex    filename
>   16577728    7598826    1744896    25921450    18b87aa    vmlinux.before
>   16577908    7598838    1744896    25921642    18b886a    vmlinux.after
> 
> bloatometer output:
> 
>   add/remove: 2/1 grow/shrink: 201/189 up/down: 5681/-3486 (2195)

And once we remove 486 support, I think we can do the optimization below 
and simply assume that BS*L does not clobber the output register in the 
zero case, right?
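
For readers following along, here is a minimal userspace sketch of the 
assumption being discussed. It is not the kernel code: model_bsf(), 
model_bsr(), model_ffs(), model_fls() and model_selfcheck() are 
hypothetical names, and the "destination is left untouched when the 
source is zero" behaviour is modelled in plain C rather than taken from 
real hardware. It shows why preloading the destination with -1 makes the 
zero case return 0 without a branch or CMOV:

```c
#include <assert.h>

/*
 * Software model of the assumed BSF zero-case behaviour: if src is 0,
 * the destination keeps whatever value it held before.
 */
int model_bsf(int dest, unsigned int src)
{
	int i;

	if (!src)
		return dest;		/* zero case: destination untouched */
	for (i = 0; !(src & 1u); i++)
		src >>= 1;
	return i;			/* index of the lowest set bit */
}

/* Same model for BSR, which finds the highest set bit. */
int model_bsr(int dest, unsigned int src)
{
	int i = -1;

	if (!src)
		return dest;		/* zero case: destination untouched */
	while (src) {
		src >>= 1;
		i++;
	}
	return i;			/* index of the highest set bit */
}

/* Preload -1, mirroring the "0" (-1) input constraint in the patch. */
int model_ffs(int x)
{
	return model_bsf(-1, (unsigned int)x) + 1;
}

int model_fls(unsigned int x)
{
	return model_bsr(-1, x) + 1;
}

/* Usage: the zero case falls out of the preload, no if() needed. */
void model_selfcheck(void)
{
	assert(model_ffs(0) == 0);
	assert(model_ffs(8) == 4);
	assert(model_fls(0) == 0);
	assert(model_fls(0x80000000u) == 32);
}
```

If a CPU did clobber the destination in the zero case, the preload 
would be lost and the "+ 1" would produce garbage, which is the concern 
the removed 486 comment describes.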

In terms of text size it's a substantial optimization on x86-32 
defconfig, saving about 4 KB of text compared to vanilla:

        text       data        bss         dec        hex    filename
    16577728    7598826    1744896    25921450    18b87aa    vmlinux.vanilla      # CMOV+BS*L
    16577908    7598838    1744896    25921642    18b886a    vmlinux.linus_patch  # if()+BS*L
    16573568    7602922    1744896    25921386    18b876a    vmlinux.noclobber    # BS*L
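
To make the table easier to read, the per-section deltas can be worked 
out with a few lines of C. This is only a sketch over the numbers 
quoted above; the struct and function names are illustrative and not 
part of any kernel tooling:

```c
#include <stdio.h>

/* One row of the size table above (names are illustrative only). */
struct vmlinux_size {
	const char *name;
	long text, data, bss;
};

/* Figures copied verbatim from the table. */
const struct vmlinux_size sizes[3] = {
	{ "vmlinux.vanilla",     16577728, 7598826, 1744896 },
	{ "vmlinux.linus_patch", 16577908, 7598838, 1744896 },
	{ "vmlinux.noclobber",   16573568, 7602922, 1744896 },
};

/* The "dec" column is simply text + data + bss. */
long size_total(const struct vmlinux_size *s)
{
	return s->text + s->data + s->bss;
}

/* Print how one build differs from a baseline, section by section. */
void print_delta(const struct vmlinux_size *base,
		 const struct vmlinux_size *s)
{
	printf("%-20s text %+6ld  data %+6ld  total %+6ld\n",
	       s->name,
	       s->text - base->text,
	       s->data - base->data,
	       size_total(s) - size_total(base));
}
```

Calling print_delta(&sizes[0], &sizes[2]) reports a text delta of 
-4160 bytes and a data delta of +4096 bytes for vmlinux.noclobber, so 
the total image shrinks by 64 bytes even though the text section alone 
shrinks by roughly 4 KB.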

Thanks,

	Ingo

---
 arch/x86/include/asm/bitops.h | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 6061c87f14ac..e3e94a806656 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -308,24 +308,16 @@ static __always_inline int variable_ffs(int x)
 {
 	int r;
 
-#ifdef CONFIG_X86_64
 	/*
 	 * AMD64 says BSFL won't clobber the dest reg if x==0; Intel64 says the
 	 * dest reg is undefined if x==0, but their CPU architect says its
 	 * value is written to set it to the same as before, except that the
 	 * top 32 bits will be cleared.
-	 *
-	 * We cannot do this on 32 bits because at the very least some
-	 * 486 CPUs did not behave this way.
 	 */
 	asm("bsfl %1,%0"
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
-#else
-	if (!x)
-		return 0;
-	asm("bsfl %1,%0" : "=r" (r) : "rm" (x));
-#endif
+
 	return r + 1;
 }
 
@@ -360,24 +352,16 @@ static __always_inline int fls(unsigned int x)
 	if (__builtin_constant_p(x))
 		return x ? 32 - __builtin_clz(x) : 0;
 
-#ifdef CONFIG_X86_64
 	/*
 	 * AMD64 says BSRL won't clobber the dest reg if x==0; Intel64 says the
 	 * dest reg is undefined if x==0, but their CPU architect says its
 	 * value is written to set it to the same as before, except that the
 	 * top 32 bits will be cleared.
-	 *
-	 * We cannot do this on 32 bits because at the very least some
-	 * 486 CPUs did not behave this way.
 	 */
 	asm("bsrl %1,%0"
 	    : "=r" (r)
 	    : ASM_INPUT_RM (x), "0" (-1));
-#else
-	if (!x)
-		return 0;
-	asm("bsrl %1,%0" : "=r" (r) : "rm" (x));
-#endif
+
 	return r + 1;
 }
 

