Date:   Wed, 6 May 2020 10:16:47 -0700
From:   Linus Torvalds <>
To:     Peter Zijlstra <>
Cc:     Nick Desaulniers <>,
        Rasmus Villemoes <>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <>,
        LKML <>,
        Steven Rostedt <>,
        Masami Hiramatsu <>,
        Daniel Bristot de Oliveira <>,
        Jason Baron <>,
        Thomas Gleixner <>,
        Ingo Molnar <>, Nadav Amit <>,
        "H. Peter Anvin" <>,
        Andy Lutomirski <>,
        Ard Biesheuvel <>,
        Josh Poimboeuf <>,
        Paolo Bonzini <>,
        Mathieu Desnoyers <>,
        "H.J. Lu" <>,
        clang-built-linux <>
Subject: Re: [PATCH v4 14/18] static_call: Add static_cond_call()

On Wed, May 6, 2020 at 6:51 AM Peter Zijlstra <> wrote:
> I was hoping for:
>         bar:                                    # @bar
>                 movl    %edi, .L_x$local(%rip)
>                 retq
>         ponies:                                 # @ponies
>                 movq    .Lfoo$local(%rip), %rax
>                 testq   %rax, %rax
>                 jz      1f
>                 jmpq    *%rcx                   # TAILCALL
>         1:
>                 retq

If you want to just avoid the 'cmov', the best way to do that is to
insert a barrier() on one side of the if-statement: that removes the
compiler's ability to turn the conditional jump into a cmov.


It looks like both clang and gcc will move the indirect jump to the
conditional sites, but then neither of them is smart enough to just
turn the indirect jump into a direct jump.

Strange. So you still get an indirect call for just the "ret" case.
The code looks actively stupid, with the two compilers generating

        movl    $__static_call_nop, %eax
        jmp     *%rax

and

        mov     eax, offset __static_call_nop
        jmp     rax                     # TAILCALL

respectively, despite the barrier not being between those two points.
The only difference is the assembler syntax (AT&T vs Intel).

Odd. That's such a trivial and obvious optimization. But presumably
it's a pattern that just doesn't happen normally.

