[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181013193335.GD31650@zn.tnic>
Date:   Sat, 13 Oct 2018 21:33:35 +0200
From:   Borislav Petkov <bp@...en8.de>
To:     Segher Boessenkool <segher@...nel.crashing.org>
Cc:     Ingo Molnar <mingo@...nel.org>, Richard Biener <rguenther@...e.de>,
        Michael Matz <matz@...e.de>, gcc@....gnu.org,
        Nadav Amit <namit@...are.com>, Ingo Molnar <mingo@...hat.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Sam Ravnborg <sam@...nborg.org>,
        Alok Kataria <akataria@...are.com>,
        Christopher Li <sparse@...isli.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "H. Peter Anvin" <hpa@...or.com>, Jan Beulich <JBeulich@...e.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Juergen Gross <jgross@...e.com>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        Kees Cook <keescook@...omium.org>,
        linux-sparse@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Philippe Ombredanne <pombredanne@...b.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        virtualization@...ts.linux-foundation.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Chris Zankel <chris@...kel.net>,
        Max Filippov <jcmvbkbc@...il.com>,
        linux-xtensa@...ux-xtensa.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: PROPOSAL: Extend inline asm syntax with size spec
Ok,
with Segher's help I've been playing with his patch ontop of bleeding
edge gcc 9 and here are my observations. Please double-check me for
booboos so that they can be addressed while there's time.
So here's what I see ontop of 4.19-rc7:
First marked the alternative asm() as inline and undeffed the "inline"
keyword. I need to do that because of the funky games we do with
"inline" redefinitions in include/linux/compiler_types.h.
And Segher hinted at either doing:
asm volatile inline(...
or
asm volatile __inline__(...
but both "inline" variants are defined as macros in that file.
Which means we either need to #undef inline before using it in asm() or
come up with something cleverer.
Anyway, did this:
---
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 4cd6a3b71824..7c0639087da7 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -165,11 +165,13 @@ static inline int alternatives_text_reserved(void *start, void *end)
  * For non barrier like inlines please define new variants
  * without volatile and memory clobber.
  */
+
+#undef inline
 #define alternative(oldinstr, newinstr, feature)                       \
-       asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
+       asm volatile inline(ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 #define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
-       asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory")
+       asm volatile inline(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory")
 
 /*
  * Alternative inline assembly with input.
---
Build failed at link time with:
arch/x86/boot/compressed/cmdline.o: In function `native_save_fl':
cmdline.c:(.text+0x0): multiple definition of `native_save_fl'
arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1e0): first defined here
arch/x86/boot/compressed/cmdline.o: In function `native_restore_fl':
cmdline.c:(.text+0x10): multiple definition of `native_restore_fl'
arch/x86/boot/compressed/misc.o:misc.c:(.text+0x1f0): first defined here
arch/x86/boot/compressed/error.o: In function `native_save_fl':
error.c:(.text+0x0): multiple definition of `native_save_fl'
which I had to fix with this:
---
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 15450a675031..0d772598c37c 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -14,8 +14,7 @@
  */
 
 /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */
-extern inline unsigned long native_save_fl(void);
-extern inline unsigned long native_save_fl(void)
+static inline unsigned long native_save_fl(void)
 {
        unsigned long flags;
 
@@ -33,8 +32,7 @@ ex
---
That "extern inline" declaration looks fishy to me anyway, maybe not really
needed ? Apparently, gcc < 4.9 complains with -Werror=missing-prototypes...
Then the build worked and the results look like this:
   text    data     bss     dec     hex filename
17287384        5040656 2019532 24347572        17383b4 vmlinux-before
17288020        5040656 2015436 24344112        1737630 vmlinux-2nd-version
so some inlining must be happening.
Then I did this:
---
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index 9c5606d88f61..a0170344cf08 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -20,7 +20,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
        /* no memory constraint because it doesn't change any memory gcc knows
           about */
        stac();
-       asm volatile(
+       asm volatile inline(
                "       testq  %[size8],%[size8]\n"
                "       jz     4f\n"
                "0:     movq $0,(%[dst])\n"
---
to force inlining of a somewhat bigger asm() statement. And yap, more
got inlined:
   text    data     bss     dec     hex filename
17287384        5040656 2019532 24347572        17383b4 vmlinux-before
17288020        5040656 2015436 24344112        1737630 vmlinux-2nd-version
17288076        5040656 2015436 24344168        1737668 vmlinux-2nd-version__clear_user
so more stuff gets inlined.
Looking at the asm output, it had before:
---
clear_user:
# ./arch/x86/include/asm/current.h:15:  return this_cpu_read_stable(current_task);
#APP
# 15 "./arch/x86/include/asm/current.h" 1
        movq %gs:current_task,%rax      #, pfo_ret__
# 0 "" 2
# arch/x86/lib/usercopy_64.c:51:        if (access_ok(VERIFY_WRITE, to, n))
#NO_APP
        movq    2520(%rax), %rdx        # pfo_ret___12->thread.addr_limit.seg, _1
        movq    %rdi, %rax      # to, tmp93
        addq    %rsi, %rax      # n, tmp93
        jc      .L3     #,
# arch/x86/lib/usercopy_64.c:51:        if (access_ok(VERIFY_WRITE, to, n))
        cmpq    %rax, %rdx      # tmp93, _1
        jb      .L3     #,
# arch/x86/lib/usercopy_64.c:52:                return __clear_user(to, n);
        jmp     __clear_user    #
---
note the JMP to __clear_user. After marking the asm() in __clear_user() as
inline, clear_user() inlines __clear_user() directly:
---
clear_user:
# ./arch/x86/include/asm/current.h:15:  return this_cpu_read_stable(current_task);
#APP
# 15 "./arch/x86/include/asm/current.h" 1
        movq %gs:current_task,%rax      #, pfo_ret__
# 0 "" 2
# arch/x86/lib/usercopy_64.c:51:        if (access_ok(VERIFY_WRITE, to, n))
#NO_APP
        movq    2520(%rax), %rdx        # pfo_ret___12->thread.addr_limit.seg, _1
        movq    %rdi, %rax      # to, tmp95
        addq    %rsi, %rax      # n, tmp95
        jc      .L8     #,
# arch/x86/lib/usercopy_64.c:51:        if (access_ok(VERIFY_WRITE, to, n))
        cmpq    %rax, %rdx      # tmp95, _1
        jb      .L8     #,
# ./arch/x86/include/asm/smap.h:58:     alternative("", __stringify(__ASM_STAC), X86_FEATURE_SMAP);
...
this last line is the stac() macro which gets inlined due to the
alternative() inlined macro above.
So I guess this all looks like what we wanted.
And if this lands in gcc9, we would need to do a asm_volatile() macro
which is defined differently based on the compiler used.
Thoughts, suggestions, etc are most welcome.
Thx.
-- 
Regards/Gruss,
    Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Powered by blists - more mailing lists
 
