Message-ID: <20190716230255.2o6w3kj6hk33vpiw@treble>
Date:   Tue, 16 Jul 2019 18:02:55 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Nick Desaulniers <ndesaulniers@...gle.com>
Cc:     Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Arnd Bergmann <arnd@...db.de>, Jann Horn <jannh@...gle.com>,
        Randy Dunlap <rdunlap@...radead.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH 10/22] bpf: Disable GCC -fgcse optimization for
 ___bpf_prog_run()

On Tue, Jul 16, 2019 at 11:15:54AM -0700, Nick Desaulniers wrote:
> On Sun, Jul 14, 2019 at 5:37 PM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >
> > On x86-64, with CONFIG_RETPOLINE=n, GCC's "global common subexpression
> > elimination" optimization results in ___bpf_prog_run()'s jumptable code
> > changing from this:
> >
> >         select_insn:
> >                 jmp *jumptable(, %rax, 8)
> >                 ...
> >         ALU64_ADD_X:
> >                 ...
> >                 jmp *jumptable(, %rax, 8)
> >         ALU_ADD_X:
> >                 ...
> >                 jmp *jumptable(, %rax, 8)
> >
> > to this:
> >
> >         select_insn:
> >                 mov jumptable, %r12
> >                 jmp *(%r12, %rax, 8)
> >                 ...
> >         ALU64_ADD_X:
> >                 ...
> >                 jmp *(%r12, %rax, 8)
> >         ALU_ADD_X:
> >                 ...
> >                 jmp *(%r12, %rax, 8)
> >
> > The jumptable address is placed in a register once, at the beginning of
> > the function.  The function execution can then go through multiple
> > indirect jumps which rely on that same register value.  This has a few
> > issues:
> >
> > 1) Objtool isn't smart enough to be able to track such a register value
> >    across multiple recursive indirect jumps through the jump table.
> >
> > 2) With CONFIG_RETPOLINE enabled, this optimization actually results in
> >    a small slowdown.  I measured a ~4.7% slowdown in the test_bpf
> >    "tcpdump port 22" selftest.
> >
> >    This slowdown is actually predicted by the GCC manual:
> >
> >      Note: When compiling a program using computed gotos, a GCC
> >      extension, you may get better run-time performance if you
> >      disable the global common subexpression elimination pass by
> >      adding -fno-gcse to the command line.
> >
> > So just disable the optimization for this function.
> >
> > Fixes: e55a73251da3 ("bpf: Fix ORC unwinding in non-JIT BPF code")
> > Reported-by: Randy Dunlap <rdunlap@...radead.org>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
> > Acked-by: Alexei Starovoitov <ast@...nel.org>
> > ---
> > Cc: Alexei Starovoitov <ast@...nel.org>
> > Cc: Daniel Borkmann <daniel@...earbox.net>
> > ---
> >  include/linux/compiler-gcc.h   | 2 ++
> >  include/linux/compiler_types.h | 4 ++++
> >  kernel/bpf/core.c              | 2 +-
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > index e8579412ad21..d7ee4c6bad48 100644
> > --- a/include/linux/compiler-gcc.h
> > +++ b/include/linux/compiler-gcc.h
> > @@ -170,3 +170,5 @@
> >  #else
> >  #define __diag_GCC_8(s)
> >  #endif
> > +
> > +#define __no_fgcse __attribute__((optimize("-fno-gcse")))
> 
> + Miguel, maintainer of compiler_attributes.h
> I wonder if the optimize attributes can be feature detected?
> Is -fno-gcse supported all the way back to GCC 4.6?

Yeah, from snooping in the GCC tree it looks like it's been around
for 18+ years.
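
For illustration only (this is not code from the patch): a minimal userspace
sketch of how a per-function attribute like this might be applied to a
computed-goto interpreter. The compiler check is an assumption that only GCC
proper accepts optimize("-fno-gcse"), with an empty fallback elsewhere. The
opcodes, struct insn, run() and demo() are hypothetical names made up for the
example; only the __no_fgcse definition itself is taken from the quoted diff.

```c
#include <assert.h>

/*
 * Assumption: only GCC proper understands optimize("-fno-gcse").
 * Clang also defines __GNUC__, so it is excluded explicitly and
 * gets the empty fallback.
 */
#if defined(__GNUC__) && !defined(__clang__)
#define __no_fgcse __attribute__((optimize("-fno-gcse")))
#else
#define __no_fgcse
#endif

/* Hypothetical three-opcode instruction set, for illustration only. */
enum { OP_ADD, OP_SUB, OP_EXIT };

struct insn {
	int op;
	long imm;
};

/*
 * Tiny interpreter using computed gotos (a GNU C extension).
 * Every handler ends with its own indirect jump through the table,
 * which is the pattern the quoted commit message describes.
 */
static long __no_fgcse run(const struct insn *insn)
{
	static const void *const jumptable[] = {
		[OP_ADD]  = &&do_add,
		[OP_SUB]  = &&do_sub,
		[OP_EXIT] = &&do_exit,
	};
	long acc = 0;

	goto *jumptable[insn->op];
do_add:
	acc += insn->imm;
	insn++;
	goto *jumptable[insn->op];
do_sub:
	acc -= insn->imm;
	insn++;
	goto *jumptable[insn->op];
do_exit:
	return acc;
}

/* Runs a fixed program: 0 + 10 - 3, then exit. */
long demo(void)
{
	static const struct insn prog[] = {
		{ OP_ADD, 10 }, { OP_SUB, 3 }, { OP_EXIT, 0 },
	};
	long result = run(prog);

	assert(result == 7);
	return result;
}
```

Building this with and without the attribute and comparing the output of
"gcc -O2 -S" is one way to check whether the jump table address gets hoisted
into a register on a given GCC version; the result may vary by version and
target.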

-- 
Josh
