Message-ID: <CAHk-=wjdLY-E3m21_QcHUauakW3qAAOCe2rxzuFEm-Af_oqG0g@mail.gmail.com>
Date: Wed, 6 May 2020 10:16:47 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Nick Desaulniers <ndesaulniers@...gle.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Jason Baron <jbaron@...mai.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>, Nadav Amit <namit@...are.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"H.J. Lu" <hjl.tools@...il.com>,
clang-built-linux <clang-built-linux@...glegroups.com>
Subject: Re: [PATCH v4 14/18] static_call: Add static_cond_call()
On Wed, May 6, 2020 at 6:51 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> I was hoping for:
>
> bar: # @bar
> movl %edi, .L_x$local(%rip)
> retq
> ponies: # @ponies
> movq .Lfoo$local(%rip), %rax
> testq %rax, %rax
> jz 1f
> jmpq *%rcx # TAILCALL
> 1:
> retq
If you want to just avoid the 'cmov', the best way to do that is to
insert a barrier() on one side of the if-statement.
That breaks the ability to turn the conditional jump into a cmov.
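A minimal sketch of that barrier() trick, written as standalone user-space C rather than the actual static_call code. The names bar, ponies and foo follow the quoted example; nop_target is a hypothetical stand-in for __static_call_nop, and barrier() is defined here as the usual empty asm with a memory clobber. Whether a given compiler version really stops emitting the cmov is something to check in the generated assembly, not something this sketch guarantees.

```c
#include <stddef.h>

/* Compiler-only barrier: an empty asm statement with a "memory" clobber. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int x;

/* Real target: stores its argument, like bar() in the quoted example. */
static void bar(int a)
{
	x = a;
}

/* Hypothetical stand-in for __static_call_nop: does nothing. */
static void nop_target(int a)
{
	(void)a;
}

/* Function pointer that may be NULL (assumed layout, for illustration). */
void (*foo)(int);

void ponies(int a)
{
	void (*func)(int) = foo;

	if (!func) {
		/*
		 * The asm statement sits on only one arm of the branch, so
		 * the two arms are no longer a plain select that could be
		 * if-converted into a cmov.
		 */
		barrier();
		func = nop_target;
	}
	func(a);
}
```

With foo left NULL, ponies() ends up calling the no-op; after foo = bar, it stores its argument in x.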
HOWEVER.
It looks like both clang and gcc will move the indirect jump to the
conditional sites, but then neither of them is smart enough to just
turn the indirect jump into one direct jump.
Strange. So you still get an indirect call for just the "ret" case.
The code looks actively stupid with
gcc:
.L7:
movl $__static_call_nop, %eax
jmp *%rax
clang:
.LBB1_1:
mov eax, offset __static_call_nop
jmp rax # TAILCALL
despite the barrier not being between those two points. The only
difference is the assembler syntax.
Odd. That's such a trivial and obvious optimization. But presumably
it's a pattern that just doesn't happen normally.
Linus