linux-kernel - Re: Linux 5.19-rc8

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wjpYLLoi1m0VRfVoyzGgmMiNwBhQ0XXG0VWwjskcz5Cug@mail.gmail.com>
Date:   Tue, 26 Jul 2022 13:20:23 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Yury Norov <yury.norov@...il.com>, Dennis Zhou <dennis@...nel.org>,
        Guenter Roeck <linux@...ck-us.net>,
        Catalin Marinas <catalin.marinas@....com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        linux-m68k@...ts.linux-m68k.org
Subject: Re: Linux 5.19-rc8

On Tue, Jul 26, 2022 at 12:44 PM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> Overall, I would say it's pretty similar (some generic perform
> marginally better, some native perform marginally better) with the
> exception of find_first_bit() being much better with the generic
> implementation, but find_next_zero_bit() being noticably worse.

The generic _find_first_bit() code is actually sane and simple. It
loops over words until it finds a non-zero one, and then does trivial
calculations on that last word.

That explains why the generic code does so much better than your byte-wise asm.

In contrast, the generic _find_next_bit() I find almost offensively
silly - which in turn explains why your byte-wide asm does better.

I think the generic _find_next_bit() should actually do what the m68k
find_next_bit code does: handle the first special word itself, and
then just call find_first_bit() on the rest of it.

And it should *not* try to handle the dynamic "bswap and/or bit sense
invert" thing at all. That should be just four different (trivial)
cases for the first word.

             Linus