lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 19 Sep 2017 20:47:10 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        Matthias Kaehlcke <mka@...omium.org>,
        Arnd Bergmann <arnd@...db.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>
Subject: Re: [PATCH 2/2] x86/asm: Fix inline asm call constraints for clang

On Tue, Sep 19, 2017 at 03:25:59PM -0700, Alexander Potapenko wrote:
> On Tue, Sep 19, 2017 at 2:55 PM, Alexander Potapenko <glider@...gle.com> wrote:
> > On Tue, Sep 19, 2017 at 11:45 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >> For inline asm statements which have a CALL instruction, we list the
> >> stack pointer as a constraint to convince GCC to ensure the frame
> >> pointer is set up first:
> >>
> >>   static inline void foo()
> >>   {
> >>         register void *__sp asm(_ASM_SP);
> >>         asm("call bar" : "+r" (__sp))
> >>   }
> >>
> >> Unfortunately, that pattern causes clang to corrupt the stack pointer.
> >>
> >> There's actually an easier way to achieve the same goal in GCC, without
> >> causing trouble for clang.  If we declare the stack pointer register
> >> variable as a global variable, and remove the constraint altogether,
> >> that convinces GCC to always set up the frame pointer before inserting
> >> *any* inline asm.
> >>
> >> It basically acts as if *every* inline asm statement has a CALL
> >> instruction.  It's a bit overkill, but the performance impact should be
> >> negligible.
> >>
> >> Here are the vmlinux .text size differences with the following configs
> >> on GCC:
> >>
> >> - defconfig
> >> - defconfig without frame pointers
> >> - Fedora distro config
> >> - Fedora distro config without frame pointers
> >>
> >>         defconfig       defconfig-nofp  distro          distro-nofp
> >> before  9796300         9466764         9076191         8789745
> >> after   9796941         9462859         9076381         8785325
> >>
> >> With frame pointers, the text size increases slightly.  Without frame
> >> pointers, the text size decreases, and a little more significantly.
> >>
> >> Reported-by: Matthias Kaehlcke <mka@...omium.org>
> >> Signed-off-by: Josh Poimboeuf <jpoimboe@...hat.com>
> >> ---
> >>  arch/x86/include/asm/alternative.h               |  3 +--
> >>  arch/x86/include/asm/asm.h                       |  9 ++++++++
> >>  arch/x86/include/asm/mshyperv.h                  | 27 ++++++++++--------------
> >>  arch/x86/include/asm/paravirt_types.h            | 14 ++++++------
> >>  arch/x86/include/asm/preempt.h                   | 15 +++++--------
> >>  arch/x86/include/asm/processor.h                 |  6 ++----
> >>  arch/x86/include/asm/rwsem.h                     |  6 +++---
> >>  arch/x86/include/asm/uaccess.h                   |  5 ++---
> >>  arch/x86/include/asm/xen/hypercall.h             |  5 ++---
> >>  arch/x86/kvm/emulate.c                           |  3 +--
> >>  arch/x86/kvm/vmx.c                               |  4 +---
> >>  arch/x86/mm/fault.c                              |  3 +--
> >>  tools/objtool/Documentation/stack-validation.txt | 20 +++++++++++++-----
> >>  13 files changed, 60 insertions(+), 60 deletions(-)
> >>
> >> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> >> index 1b020381ab38..7aeb1cef4204 100644
> >> --- a/arch/x86/include/asm/alternative.h
> >> +++ b/arch/x86/include/asm/alternative.h
> >> @@ -218,10 +218,9 @@ static inline int alternatives_text_reserved(void *start, void *end)
> >>  #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2,   \
> >>                            output, input...)                                  \
> >>  {                                                                            \
> >> -       register void *__sp asm(_ASM_SP);                                     \
> >>         asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
> >>                 "call %P[new2]", feature2)                                    \
> >> -               : output, "+r" (__sp)                                         \
> >> +               : output                                                      \
> >>                 : [old] "i" (oldfunc), [new1] "i" (newfunc1),                 \
> >>                   [new2] "i" (newfunc2), ## input);                           \
> >>  }
> >> diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
> >> index 676ee5807d86..ff8921d4615b 100644
> >> --- a/arch/x86/include/asm/asm.h
> >> +++ b/arch/x86/include/asm/asm.h
> >> @@ -132,4 +132,13 @@
> >>  /* For C file, we already have NOKPROBE_SYMBOL macro */
> >>  #endif
> >>
> >> +#ifndef __ASSEMBLY__
> >> +/*
> >> + * This variable declaration has the side effect of forcing GCC to always set
> >> + * up the frame pointer before inserting any inline asm.  This is necessary
> >> + * because some inline asm statements have CALL instructions.
> >> + */
> >> +register unsigned int __sp_unused asm("esp");

> > Shouldn't this be "asm(_ASM_SP)"?

> Answering my own question, this can't be _ASM_SP because of the
> realmode code, which is built with -m16 whereas _ASM_SP is "rsp".
> The patch works fine with "esp" though - most certainly declaring a
> ESP variable is enough to reserve the whole RSP in an x86_64 build.

Right, clang failing to build the realmode code was exactly why I did
that.

-- 
Josh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ