lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 15 May 2017 22:31:17 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Yury Norov <ynorov@...iumnetworks.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Richard Henderson <rth@...ddle.net>,
        Ivan Kokshaysky <ink@...assic.park.msu.ru>,
        Matt Turner <mattst88@...il.com>,
        Akinobu Mita <mita@...aclelinux.com>,
        Mike Galbraith <efault@....de>
Subject: Re: [PATCH] sched: remove sched_find_first_bit()

On Mon, May 15, 2017 at 6:17 PM, Yury Norov <ynorov@...iumnetworks.com> wrote:
> On Mon, May 15, 2017 at 06:06:18PM +0200, Arnd Bergmann wrote:
>> On Mon, May 15, 2017 at 5:47 PM, Yury Norov <ynorov@...iumnetworks.com> wrote:
>> > On Sun, May 14, 2017 at 08:09:17PM +0200, Ingo Molnar wrote:
>> >
>> > I also think that sched_find_first_bit() may be faster that find_first_bit()
>> > because it's inlined in the caller. We can do so for find_first_bit() if
>> > it takes small sizes at compile time, and so all parts of kernel will
>> > use fast find_first_bit, not only sched.
>>
>> I suspect the first step would be to 'select GENERIC_FIND_FIRST_BIT'
>> on ARM64, which should already improve the performance for those
>> files that never call the 'next' variants.
>>
>> Adding an inline version of find_first_{,zero_}bit could also help, but
>> is harder to quantify.
>>
>
> I checked again, and in fact I measured on top of this patch:
> https://lkml.org/lkml/2017/5/13/137
> So find_first_bit is already enabled.

Ok. I've played around with this for a bit more and came to a generic
version that is almost as good as the current sched_find_first_bit()
on x86 (one extra comparison):

+#define sched_find_first_bit(b) find_first_bit(b, 128)
-extern unsigned long find_first_bit(const unsigned long *addr,
+extern unsigned long __find_first_bit(const unsigned long *addr,
                                    unsigned long size);

+static inline unsigned long find_first_bit(const unsigned long *addr,
+                                   unsigned long size)
+{
+       unsigned long idx;
+
+       if (!__builtin_constant_p(size))
+               return __find_first_bit(addr, size);
+
+       idx = 0;
+       switch (size) {
+       case BITS_PER_LONG * 4:
+               if (addr[0])
+                       return __ffs(addr[0]) + idx;
+               addr++;
+               idx += BITS_PER_LONG;
+       case BITS_PER_LONG * 3:
+               if (addr[0])
+                       return __ffs(addr[0]) + idx;
+               addr++;
+               idx += BITS_PER_LONG;
+       case BITS_PER_LONG * 2:
+               if (addr[0])
+                       return __ffs(addr[0]) + idx;
+               addr++;
+               idx += BITS_PER_LONG;
+       case BITS_PER_LONG * 1:
+               if (addr[0])
+                       return __ffs(addr[0]) + idx;
+               addr++;
+               idx += BITS_PER_LONG;
+               return idx;
+       }
+
+       return __find_first_bit(addr, size);
+}

However, on architectures that rely on
include/asm-generic/bitops/__ffs.h or something
similarly verbose, this would just add needless bloat
to the size rather than actually making a difference
in performance.

        Arnd

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ