Message-ID: <CAHk-=wjpYLLoi1m0VRfVoyzGgmMiNwBhQ0XXG0VWwjskcz5Cug@mail.gmail.com>
Date: Tue, 26 Jul 2022 13:20:23 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Yury Norov <yury.norov@...il.com>, Dennis Zhou <dennis@...nel.org>,
Guenter Roeck <linux@...ck-us.net>,
Catalin Marinas <catalin.marinas@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
linux-m68k@...ts.linux-m68k.org
Subject: Re: Linux 5.19-rc8
On Tue, Jul 26, 2022 at 12:44 PM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> Overall, I would say it's pretty similar (some generic perform
> marginally better, some native perform marginally better) with the
> exception of find_first_bit() being much better with the generic
> implementation, but find_next_zero_bit() being noticeably worse.
The generic _find_first_bit() code is actually sane and simple. It
loops over words until it finds a non-zero one, and then does trivial
calculations on that last word.
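As a rough illustration of that loop (not the kernel source), here is a
minimal userspace sketch. The name sketch_find_first_bit() is
hypothetical, and __builtin_ctzl() is a GCC/Clang builtin standing in
for the kernel's __ffs():

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical stand-in for the generic _find_first_bit(): returns the
 * index of the first set bit, or 'size' if no bit below 'size' is set.
 */
static size_t sketch_find_first_bit(const unsigned long *addr, size_t size)
{
	size_t idx;

	/* Walk whole words until one is non-zero. */
	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		unsigned long word = addr[idx];

		if (word) {
			/* Trivial calculation on that last word. */
			size_t bit = idx * BITS_PER_LONG +
				     (size_t)__builtin_ctzl(word);

			/* Ignore bits at or beyond 'size' in a partial word. */
			return bit < size ? bit : size;
		}
	}
	return size;
}
```

The only per-word work in the loop is a load and a compare against
zero, which is why a word-at-a-time loop can beat a byte-at-a-time one.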
That explains why the generic code does so much better than your byte-wise asm.
In contrast, the generic _find_next_bit() I find almost offensively
silly - which in turn explains why your byte-wide asm does better.
I think the generic _find_next_bit() should actually do what the m68k
find_next_bit code does: handle the first special word itself, and
then just call find_first_bit() on the rest of it.
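A sketch of that structure, under the same assumptions as any
userspace mock-up: the sketch_*() names are hypothetical, and
__builtin_ctzl() (GCC/Clang) stands in for the kernel's __ffs(). This
is not the m68k code or a proposed patch, just the shape of the idea:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical word-at-a-time search; returns 'size' if nothing is set. */
static size_t sketch_find_first_bit(const unsigned long *addr, size_t size)
{
	size_t idx;

	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		if (addr[idx]) {
			size_t bit = idx * BITS_PER_LONG +
				     (size_t)__builtin_ctzl(addr[idx]);
			return bit < size ? bit : size;
		}
	}
	return size;
}

/* Hypothetical find_next_bit(): special first word, then delegate. */
static size_t sketch_find_next_bit(const unsigned long *addr, size_t size,
				   size_t start)
{
	size_t idx, base;
	unsigned long word;

	if (start >= size)
		return size;

	idx = start / BITS_PER_LONG;
	base = idx * BITS_PER_LONG;

	/* First word: mask off the bits below 'start'. */
	word = addr[idx] & (~0UL << (start % BITS_PER_LONG));
	if (word) {
		size_t bit = base + (size_t)__builtin_ctzl(word);
		return bit < size ? bit : size;
	}

	/* Everything after the first word is a plain find_first_bit(). */
	base += BITS_PER_LONG;
	if (base >= size)
		return size;
	return base + sketch_find_first_bit(addr + idx + 1, size - base);
}
```

When the tail search finds nothing it returns size - base, so the sum
comes out as size, matching the usual "not found" convention.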
And it should *not* try to handle the dynamic "bswap and/or bit sense
invert" thing at all. That should be just four different (trivial)
cases for the first word.
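One possible reading of "four trivial cases", sketched as four tiny
helpers that each produce the masked first word for one combination of
byte swap and bit-sense inversion. All names here are hypothetical, and
the portable byte-swap loop stands in for the kernel's swab():

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Mask that keeps only the bits at or above 'start' within one word. */
#define SKETCH_FIRST_WORD_MASK(start) (~0UL << ((start) % BITS_PER_LONG))

/* Portable byte reversal of an unsigned long (assumed swab() stand-in). */
static unsigned long sketch_bswap(unsigned long w)
{
	unsigned long out = 0;
	size_t i;

	for (i = 0; i < sizeof(w); i++) {
		out = (out << CHAR_BIT) | (w & UCHAR_MAX);
		w >>= CHAR_BIT;
	}
	return out;
}

/* Case 1: search for set bits, native byte order. */
static unsigned long first_word_set(unsigned long w, size_t start)
{
	return w & SKETCH_FIRST_WORD_MASK(start);
}

/* Case 2: search for zero bits, native byte order. */
static unsigned long first_word_zero(unsigned long w, size_t start)
{
	return ~w & SKETCH_FIRST_WORD_MASK(start);
}

/* Case 3: search for set bits, byte-swapped word. */
static unsigned long first_word_set_swapped(unsigned long w, size_t start)
{
	return sketch_bswap(w) & SKETCH_FIRST_WORD_MASK(start);
}

/* Case 4: search for zero bits, byte-swapped word. */
static unsigned long first_word_zero_swapped(unsigned long w, size_t start)
{
	return ~sketch_bswap(w) & SKETCH_FIRST_WORD_MASK(start);
}
```

Each caller would pick its case at compile time, so no run-time
"invert" or "swap" flag has to be tested inside the search itself.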
Linus