Message-ID: <CAHk-=wj0S2vWui0Y+1hpYMEhCiXKexbQ01h+Ckvww8hB29az_A@mail.gmail.com>
Date: Sun, 27 Apr 2025 12:34:46 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Arnd Bergmann <arnd@...db.de>, Ingo Molnar <mingo@...nel.org>, Arnd Bergmann <arnd@...nel.org>, 
	Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H. Peter Anvin" <hpa@...or.com>, Juergen Gross <jgross@...e.com>, 
	Boris Ostrovsky <boris.ostrovsky@...cle.com>, 
	Alexander Usyskin <alexander.usyskin@...el.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Mateusz Jończyk <mat.jonczyk@...pl>, 
	Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	linux-kernel@...r.kernel.org, xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] [RFC] x86/cpu: rework instruction set selection

On Sun, 27 Apr 2025 at 12:17, Andrew Cooper <andrew.cooper3@...rix.com> wrote:
>
> ffs/fls are commonly found inside loops where x is the loop condition
> too.  Therefore, using statically_true() to provide a form without the
> zero compatibility turns out to be a win.

We already have the version without the zero capability - it's just
called "__ffs()" and "__fls()", and performance-critical code uses
those.

So fls/ffs are the "standard" library functions that have to handle
zero, and add that stupid "+1" because that interface was designed by
some Pascal person who doesn't understand that we start counting from
0.
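
To illustrate the two interfaces, here is a minimal user-space sketch,
assuming a GCC- or Clang-compatible compiler. The sketch_* names are
hypothetical stand-ins built on compiler builtins; they are not the
kernel's per-architecture implementations.

```c
/*
 * Hypothetical user-space stand-ins (assumption: GCC/Clang builtins).
 * These only mirror the calling conventions discussed above.
 */

/* Like __ffs(): 0-based index of the lowest set bit; undefined for 0. */
static inline unsigned int sketch___ffs(unsigned long x)
{
	return (unsigned int)__builtin_ctzl(x);
}

/* Like __fls(): 0-based index of the highest set bit; undefined for 0. */
static inline unsigned int sketch___fls(unsigned long x)
{
	return (unsigned int)(sizeof(x) * 8 - 1 - __builtin_clzl(x));
}

/* Like ffs(): 1-based index of the lowest set bit; returns 0 for 0. */
static inline int sketch_ffs(int x)
{
	return __builtin_ffs(x);
}

/* Like fls(): 1-based index of the highest set bit; returns 0 for 0. */
static inline int sketch_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * 8) - __builtin_clz(x) : 0;
}
```

For any non-zero x that fits in an int, sketch_ffs(x) equals
sketch___ffs(x) + 1, which is the "+1" referred to above.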

Standards bodies: "companies aren't sending their best people".

But it's silly that we then spend effort on magic cmov in inline asm
on those things when it's literally the "don't use this version unless
you don't actually care about performance" case.

I don't think it would be wrong to make the x86-32 code just do
the check against zero ahead of time - in C.

And yes, that will generate some extra code - you'll test for zero
before, and then the caller might also test for a zero result that
then results in another test for zero that can't actually happen (but
the compiler doesn't know that). But I suspect that on the whole, it
is likely to generate better code anyway just because the compiler
sees that first test and can DTRT.
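
The idea can be sketched roughly as follows. This is NOT the attached
patch (its contents are not shown here); it is a hypothetical
user-space illustration, assuming GCC/Clang builtins, of testing for
zero in plain C before the bit scan, plus a caller that tests the
result for zero again. All names are made up for the example.

```c
/* Hypothetical illustration only; not the kernel code or the patch. */

/* ffs()-style: 1-based lowest set bit, 0 for x == 0. */
static inline int ffs_zero_checked(int x)
{
	if (!x)		/* zero handled up front, in C, visible to the compiler */
		return 0;
	return __builtin_ctz((unsigned int)x) + 1;
}

/* fls()-style: 1-based highest set bit, 0 for x == 0. */
static inline int fls_zero_checked(unsigned int x)
{
	if (!x)
		return 0;
	return (int)(sizeof(x) * 8) - __builtin_clz(x);
}

/*
 * A caller of the kind described above: the loop condition tests the
 * result for zero, in addition to the zero test inside the helper.
 */
static int count_bits_via_ffs(unsigned int mask)
{
	int n = 0;
	int bit;

	while ((bit = ffs_zero_checked((int)mask)) != 0) {
		mask &= ~(1u << (bit - 1));	/* clear the bit just found */
		n++;
	}
	return n;
}
```

Because the zero test is ordinary C rather than hidden inside inline
asm, the compiler is free to merge or drop redundant tests once the
helper is inlined into the caller.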

UNTESTED patch attached in case somebody wants to play with this. It
removes 10 lines of silly code, and along with them that 'cmov' use.

Anybody?

              Linus

View attachment "patch.diff" of type "text/x-patch" (1214 bytes)
