lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wi=nuDW6yCXSA-dEztZhXNuzLOaH--s_V7GOAE7n6RsRw@mail.gmail.com>
Date: Tue, 29 Apr 2025 14:53:31 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "H. Peter Anvin" <hpa@...or.com>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>, Ingo Molnar <mingo@...nel.org>, 
	Arnd Bergmann <arnd@...db.de>, Arnd Bergmann <arnd@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	Juergen Gross <jgross@...e.com>, Boris Ostrovsky <boris.ostrovsky@...cle.com>, 
	Alexander Usyskin <alexander.usyskin@...el.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Mateusz Jończyk <mat.jonczyk@...pl>, 
	Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	linux-kernel@...r.kernel.org, xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case
 handling to C

On Tue, 29 Apr 2025 at 14:24, H. Peter Anvin <hpa@...or.com> wrote:
>
> Could you file a gcc bug? Gcc shouldn't generate worse code than inline asm ...

Honestly, I've given up on that idea.

That's basically what the bug report I reported about clts was - gcc
generating worse code than inline asm.

But why would we use gcc builtins at all if we have inline asm that is
better than what som eversions of gcc generates? The inline asm works
for all versions.

Anyway, attached is a test file that seems to generate basically
"optimal" code. It's not a kernel patch, but a standalone C file for
testing with a couple of stupid test-cases to make sure that it gets
the constant argument case right, and that it optimizes the "I know
it's not zero" case.

That is why it also uses that "#if __SIZEOF_LONG__ == 4" instead of
something like CONFIG_64BIT.

So it's a proof-of-concept thing, and it does seem to be fairly simple.

The "important" parts are just the

  #define variable_ffs(x) \
        (statically_true((x)!=0) ? __ffs(x)+1 : do_variable_ffs(x))
  #define constant_ffs(x) __builtin_ffs(x)
  #define ffs(x) \
        (__builtin_constant_p(x) ? constant_ffs(x) : variable_ffs(x))

lines which end up picking the right choice. The rest is either the
"reimplement the kernel infrastructure for testing" or the trivial
do_variable_ffs() thing.

I just did

    gcc -m32 -O2 -S -fomit-frame-pointer t.c

(with, and without that -m32) and looked at the result to see that it
looks sane. No *actual* testing.


                Linus

View attachment "t.c" of type "text/x-csrc" (826 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ