lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e54f1943-e0ff-4f59-b24f-9b5a7a38becf@citrix.com>
Date: Sun, 27 Apr 2025 20:17:23 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
 Arnd Bergmann <arnd@...db.de>
Cc: Ingo Molnar <mingo@...nel.org>, Arnd Bergmann <arnd@...nel.org>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
 x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
 Juergen Gross <jgross@...e.com>, Boris Ostrovsky
 <boris.ostrovsky@...cle.com>, Alexander Usyskin
 <alexander.usyskin@...el.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Mateusz Jończyk <mat.jonczyk@...pl>,
 Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
 xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] [RFC] x86/cpu: rework instruction set selection

On 26/04/2025 8:55 pm, Linus Torvalds wrote:
> So I think that manual cmov pattern for x86-32 should be replaced with
>
>         bool zero;
>
>         asm("bsfl %[in],%[out]"
>             CC_SET(z)
>             : CC_OUT(z) (zero),
>               [out]"=r" (r)
>             : [in] "rm" (x));
>
>         return zero ? 0 : r+1;
>
> instead (that's ffs(), and fls() would need the same thing except with
> bsrl insteadm, of course).
>
> I bet that would actually improve code generation.

It is possible to do better still.

ffs/fls are commonly found inside loops where x is the loop condition
too.  Therefore, using statically_true() to provide a form without the
zero compatibility turns out to be a win.

> And I also bet it doesn't actually matter, of course.

Something that neither Linux nor Xen had, which makes a reasonable
difference, is a for_each_set_bit() optimised over a scalar value.  The
existing APIs make it all too easy to spill the loop condition to memory.

~Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ