Open Source and information security mailing list archives
 
Message-ID: <e5e97ff8-9670-40d1-a0fa-69504d34c4c4@citrix.com>
Date: Tue, 29 Apr 2025 03:25:17 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: "H. Peter Anvin" <hpa@...or.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Ingo Molnar <mingo@...nel.org>
Cc: Arnd Bergmann <arnd@...db.de>, Arnd Bergmann <arnd@...nel.org>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
 x86@...nel.org, Juergen Gross <jgross@...e.com>,
 Boris Ostrovsky <boris.ostrovsky@...cle.com>,
 Alexander Usyskin <alexander.usyskin@...el.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Mateusz Jończyk <mat.jonczyk@...pl>,
 Mike Rapoport <rppt@...nel.org>, Ard Biesheuvel <ardb@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org,
 xen-devel@...ts.xenproject.org
Subject: Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case
 handling to C

On 29/04/2025 3:00 am, H. Peter Anvin wrote:
> On April 28, 2025 5:12:13 PM PDT, Andrew Cooper <andrew.cooper3@...rix.com> wrote:
>> On 28/04/2025 10:38 pm, H. Peter Anvin wrote:
>>> On April 28, 2025 9:14:45 AM PDT, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>>>> On Mon, 28 Apr 2025 at 00:05, Ingo Molnar <mingo@...nel.org> wrote:
>>>>> And once we remove 486, I think we can do the optimization below to
>>>>> just assume the output doesn't get clobbered by BS*L in the zero-case,
>>>>> right?
>>>> We probably can't, because who knows what "Pentium" CPU's are out there.
>>>>
>>>> Or even if Pentium really does get it right. I doubt we have any
>>>> developers with an original Pentium around.
>>>>
>>>> So just leave the "we don't know what the CPU result is for zero"
>>>> unless we get some kind of official confirmation.
>>>>
>>>>          Linus
>>> If anyone knows for sure, it is probably Christian Ludloff. However, there was a *huge* tightening of the formal ISA when the i686 was introduced (family=6) and I really believe this was part of it.
>>>
>>> I also really don't trust that family=5 really means conforms to undocumented P5 behavior, e.g. for Quark.
>> https://www.sandpile.org/x86/flags.htm
>>
>> That's a lot of "can't even characterise the result" in the P5.
>>
>> Looking at P4 column, that is clearly what the latest SDM has
>> retroactively declared to be architectural.
>>
>> ~Andrew
> Yes, but it wasn't about flags here. 
>
> Now, question: can we just use __builtin_*() for these? I think gcc should always generate inline code for these on x86.

Yes, it does generate inline code:  https://godbolt.org/z/M45oo5rqT

GCC does it branchlessly, but cannot optimise based on context.

Clang can optimise based on context, except in the 0 case, it seems.

Moving to -march=i686 causes both GCC and Clang to switch to CMOV and
create branchless code, but GCC still can't optimise out the CMOV based
on context.

~Andrew
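
For readers following the __builtin_*() question, here is a minimal sketch
of what builtin-based helpers could look like. It assumes GCC or Clang and a
32-bit unsigned int. The names sketch_ffs() and sketch_fls() are
hypothetical; this is neither the code behind the godbolt link above nor the
kernel's implementation.

```c
/*
 * Illustrative only: hypothetical helpers, not the kernel's ffs()/fls().
 * Assumes GCC or Clang builtins and a 32-bit unsigned int (true on x86).
 */

/* Index (1-based) of the least significant set bit, or 0 if x == 0. */
static inline int sketch_ffs(int x)
{
	/* __builtin_ffs() is defined for a zero input and returns 0. */
	return __builtin_ffs(x);
}

/* Index (1-based) of the most significant set bit, or 0 if x == 0. */
static inline int sketch_fls(unsigned int x)
{
	/*
	 * __builtin_clz(0) is undefined, so the zero case is handled in C
	 * rather than relying on what the instruction leaves in the
	 * destination register.
	 */
	return x ? 32 - __builtin_clz(x) : 0;
}
```

Keeping the zero check in C means the compiler sees it, so it has the
opportunity to drop the check when the caller has already established that
the argument is nonzero. Whether a given compiler actually does so is the
behaviour being compared in the message above.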
