Date:   Thu, 5 Apr 2018 10:04:46 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Matthias Kaehlcke <mka@...omium.org>,
        Arnd Bergmann <arnd@...db.de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        James Y Knight <jyknight@...gle.com>,
        Chandler Carruth <chandlerc@...gle.com>,
        Stephen Hines <srhines@...gle.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Kees Cook <keescook@...gle.com>,
        Guenter Roeck <groeck@...omium.org>,
        Greg Hackmann <ghackmann@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [GIT PULL] x86/build changes for v4.17


* Peter Zijlstra <peterz@...radead.org> wrote:

> On Wed, Apr 04, 2018 at 05:05:25PM -0700, Linus Torvalds wrote:
> > for some reason the test_bit() case looks like
> > this:
> > 
> >   #define test_bit(nr, addr)                      \
> >         (__builtin_constant_p((nr))             \
> >          ? constant_test_bit((nr), (addr))      \
> >          : variable_test_bit((nr), (addr)))
> > 
> > which is much more straightforward anyway. I'm not quite sure why we
> > did it that odd way anyway, but I bet it's just "hysterical raisins"
> > along with the test_bit() not needing inline asm at all for the
> > constant case.
> 
> I always assumed BT was a more expensive instruction than AND with
> immediate.

According to:

   http://www.agner.org/optimize/instruction_tables.pdf

the Skylake costs for the 'BT', 'AND' and 'TEST' variants are as follows
(the last column is reciprocal throughput, in cycles per instruction):

         Instruction        Operands      uops fused    uops unfused       uops port    latency throughput
                  BT           r,r/i               1               1             p06          1        0.5
                  BT             m,r              10              10                                     5
                  BT             m,i               2               2         p06 p23                   0.5
         BTR BTS BTC           r,r/i               1               1             p06          1        0.5
         BTR BTS BTC             m,r              10              11                                     5
         BTR BTS BTC             m,i               3               4      p06 p4 p23                     1
          AND OR XOR           r,r/i               1               1           p0156          1       0.25
          AND OR XOR             r,m               1               2       p0156 p23                   0.5
          AND OR XOR           m,r/i               2               4 2p0156 2p237 p4          5          1
                TEST           r,r/i               1               1           p0156          1       0.25
                TEST           m,r/i               1               2       p0156 p23          1        0.5


So if I'm reading it right, the relevant comparison would be:

                  BT             m,i               2               2         p06 p23                   0.5
          AND OR XOR           m,r/i               2               4 2p0156 2p237 p4          5          1

?
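
For reference, the two code paths behind the test_bit() macro quoted above
can be sketched in plain user-space C. This is only an illustration under
stated assumptions, not the kernel's implementation: as far as I know the
x86 variable_test_bit() uses BT via inline asm, whereas the version here is
portable C. BITS_PER_LONG is defined locally, test_bit_self_check() is a
hypothetical helper added for the example, and __builtin_constant_p() needs
GCC or Clang.

```c
#include <assert.h>
#include <limits.h>

/* Local stand-in for the kernel's BITS_PER_LONG (assumption for this sketch). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Constant bit number: word index and mask fold to constants at compile
 * time, so the compiler is free to emit an AND/TEST with an immediate.
 */
static inline int constant_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] & (1UL << (nr % BITS_PER_LONG))) != 0;
}

/*
 * Variable bit number: portable shift-and-mask version. An x86-specific
 * version could use the BT instruction here instead.
 */
static inline int variable_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (int)((addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* Same dispatch shape as the macro quoted in the thread. */
#define test_bit(nr, addr)                      \
	(__builtin_constant_p((nr))             \
	 ? constant_test_bit((nr), (addr))      \
	 : variable_test_bit((nr), (addr)))

/* Hypothetical helper: both paths must agree on the same bitmap. */
static inline void test_bit_self_check(void)
{
	unsigned long map[2] = { 1UL << 3, 1UL };
	volatile unsigned long nr = 3;	/* volatile keeps this path non-constant */

	assert(test_bit(3, map) == 1);	/* constant path */
	assert(test_bit(nr, map) == 1);	/* variable path */
	assert(test_bit(4, map) == 0);
}
```

Both helpers return 0 or 1, so callers cannot tell which path the macro
picked; only the generated instructions differ.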

Thanks,

	Ingo
