Message-ID: <20180405080446.qomyc6ozug3g57gl@gmail.com>
Date: Thu, 5 Apr 2018 10:04:46 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Matthias Kaehlcke <mka@...omium.org>,
Arnd Bergmann <arnd@...db.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
James Y Knight <jyknight@...gle.com>,
Chandler Carruth <chandlerc@...gle.com>,
Stephen Hines <srhines@...gle.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Kees Cook <keescook@...gle.com>,
Guenter Roeck <groeck@...omium.org>,
Greg Hackmann <ghackmann@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [GIT PULL] x86/build changes for v4.17
* Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Apr 04, 2018 at 05:05:25PM -0700, Linus Torvalds wrote:
> > for some reason the test_bit() case looks like
> > this:
> >
> > #define test_bit(nr, addr) \
> > (__builtin_constant_p((nr)) \
> > ? constant_test_bit((nr), (addr)) \
> > : variable_test_bit((nr), (addr)))
> >
> > which is much more straightforward anyway. I'm not quite sure why we
> > did it that odd way anyway, but I bet it's just "hysterical raisins"
> > along with the test_bit() not needing inline asm at all for the
> > constant case.
>
> I always assumed BT was a more expensive instruction than AND with
> immediate.
According to:
http://www.agner.org/optimize/instruction_tables.pdf
The SkyLake costs for 'BT', 'AND' and 'TEST' variants are:
Instruction    Operands  uops fused  uops unfused  uops port         latency  throughput
BT             r,r/i     1           1             p06               1        0.5
BT             m,r       10          10            -                 -        5
BT             m,i       2           2             p06 p23           -        0.5
BTR BTS BTC    r,r/i     1           1             p06               1        0.5
BTR BTS BTC    m,r       10          11            -                 -        5
BTR BTS BTC    m,i       3           4             p06 p4 p23        -        1
AND OR XOR     r,r/i     1           1             p0156             1        0.25
AND OR XOR     r,m       1           2             p0156 p23         -        0.5
AND OR XOR     m,r/i     2           4             2p0156 2p237 p4   5        1
TEST           r,r/i     1           1             p0156             1        0.25
TEST           m,r/i     1           2             p0156 p23         1        0.5
So if I'm reading it right, the relevant comparison would be:
BT             m,i       2           2             p06 p23           -        0.5
AND OR XOR     m,r/i     2           4             2p0156 2p237 p4   5        1
?
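For context, the test_bit() dispatch quoted above can be sketched in
userspace C roughly as follows. This is only an illustration, not the
kernel's actual code: BITS_PER_LONG is defined locally, test_bit_demo()
is a hypothetical helper, and the variable case is written in portable C
here, whereas the quoted remark implies the kernel uses inline asm for
it. It assumes GCC or Clang for __builtin_constant_p(). Which
instruction the compiler picks for each helper is up to the compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: local stand-in for the kernel's BITS_PER_LONG. */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Constant nr: plain C; the mask and word index fold at compile time. */
static inline bool constant_test_bit(unsigned long nr,
				     const volatile unsigned long *addr)
{
	return ((1UL << (nr % BITS_PER_LONG)) &
		addr[nr / BITS_PER_LONG]) != 0;
}

/*
 * Variable nr: portable shift-and-mask stand-in for this sketch.
 * The kernel version is not reproduced here.
 */
static inline bool variable_test_bit(unsigned long nr,
				     const volatile unsigned long *addr)
{
	return ((addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL) != 0;
}

/* Same shape as the quoted macro: pick the helper at compile time. */
#define test_bit(nr, addr) \
	(__builtin_constant_p((nr)) \
	 ? constant_test_bit((nr), (addr)) \
	 : variable_test_bit((nr), (addr)))

/* Hypothetical usage helper: both paths must agree on the result. */
void test_bit_demo(void)
{
	unsigned long map[2] = { 1UL << 3, 1UL << 1 };
	volatile unsigned long n = 3;	/* volatile: not a compile-time constant */

	assert(test_bit(3, map));	/* constant path */
	assert(test_bit(n, map));	/* variable path */
	assert(!test_bit(4, map));
}
```

Both helpers return the same answer for any in-range bit number; the
macro only selects which code shape the compiler gets to work with.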
Thanks,
Ingo