Message-ID: <87wncauslw.ffs@tglx>
Date:   Mon, 18 Jul 2022 21:29:47 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     LKML <linux-kernel@...r.kernel.org>
Cc:     x86@...nel.org, Linus Torvalds <torvalds@...ux-foundation.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Andrew Cooper <Andrew.Cooper3@...rix.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Johannes Wikner <kwikner@...z.ch>,
        Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
        Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
        Joao Moreira <joao.moreira@...el.com>,
        Joseph Nuzman <joseph.nuzman@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juergen Gross <jgross@...e.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

On Sun, Jul 17 2022 at 01:17, Thomas Gleixner wrote:
> The function alignment option does not work for that because it just
> guarantees that the next function entry is aligned, but the padding size
> depends on the position of the last instruction of the previous function
> which might be anything between 0 and padsize-1 obviously, which is not a
> good starting point to put 10 bytes of accounting code into it reliably.
>
> I hacked up GCC to emit such padding and from first experimentation it
> brings quite some performance back.
>
>                        IBRS       stuff      stuff(pad)
> sockperf 14   bytes:   -23.76%    -19.26%    -14.31%
> sockperf 1472 bytes:   -22.51%    -18.40%    -12.25%
> microbench:            +37.20%    +18.46%    +15.47%
> hackbench:             +21.24%    +10.94%    +10.12%
>
> For FIO I don't have numbers yet, but I expect FIO to get a significant
> gain too.
>
> From a quick survey it seems to have no impact for the case where the
> thunks are not used. But that really needs some deep investigation and
> there is a potential conflict with the clang CFI efforts.
>
> The kernel text size increases with a Debian config from 9.9M to 10.4M, so
> about 5%. If the thunk is not 16 byte aligned, the text size increase is
> about 3%, but it turned out that 16 byte aligned is slightly faster.
>
> The 16 byte function alignment turned out to be beneficial in general even
> without the thunks. Not much of an improvement, but measurable. We should
> revisit this independent of these horrors.
>
> The implementation falls back to the allocated thunks when padding is not
> available. I'll send out the GCC patch and the required kernel patch as a
> reply to this series after polishing it a bit.
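
To make the alignment point above concrete, here is a minimal sketch of the
arithmetic. It is not part of either patch. The helper names align_gap() and
padded_entry() are made up for illustration, and the sketch assumes that the
function label directly follows the two directives emitted by the GCC patch
below (.align 16, then .skip 16,0xcc).

```c
#include <assert.h>

/*
 * Plain function alignment: number of filler bytes between the end of the
 * previous function (prev_end = offset of the first free byte) and the next
 * 'align'-byte boundary. The result is anywhere between 0 and align - 1.
 */
static unsigned long align_gap(unsigned long prev_end, unsigned long align)
{
	assert(align != 0);
	return (align - (prev_end % align)) % align;
}

/*
 * Forced padding (assumed layout: ".align 16", ".skip 16,0xcc", label):
 * round prev_end up to the next 16 byte boundary, then skip 16 more bytes.
 * The returned entry offset always has at least 16 bytes of padding in
 * front of it, independent of where the previous function ended.
 */
static unsigned long padded_entry(unsigned long prev_end)
{
	const unsigned long pad = 16;

	return prev_end + align_gap(prev_end, pad) + pad;
}
```

With alignment only, a function ending at offset 0x1001 leaves a 15 byte gap
while one ending at 0x1000 leaves none, so 10 bytes of accounting code fit
only by luck. With the forced padding both cases get a full 16 byte area in
front of the entry point (0x1020 and 0x1010 respectively).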

Here it goes. GCC hackery first.

---
Subject: gcc: Add padding in front of function entry points
From: Thomas Gleixner <tglx@...utronix.de>
Date: Fri, 15 Jul 2022 14:37:53 +0200

For testing purposes:

Add a 16 byte padding filled with int3 in front of each function entry
so the kernel can put call depth accounting into it.

Not-Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
---
 gcc/config/i386/i386.cc  |   11 +++++++++++
 gcc/config/i386/i386.h   |    7 +++++++
 gcc/config/i386/i386.opt |    4 ++++
 gcc/doc/invoke.texi      |    6 ++++++
 4 files changed, 28 insertions(+)

--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -6182,6 +6182,17 @@ ix86_code_end (void)
     file_end_indicate_split_stack ();
 }
 
+void
+x86_asm_output_function_prefix (FILE *asm_out_file,
+				const char *fnname ATTRIBUTE_UNUSED)
+{
+  if (flag_force_function_padding)
+    {
+      fprintf (asm_out_file, "\t.align 16\n");
+      fprintf (asm_out_file, "\t.skip 16,0xcc\n");
+    }
+}
+
 /* Emit code for the SET_GOT patterns.  */
 
 const char *
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2860,6 +2860,13 @@ extern enum attr_cpu ix86_schedule;
 #define LIBGCC2_UNWIND_ATTRIBUTE __attribute__((target ("no-mmx,no-sse")))
 #endif
 
+#include <stdio.h>
+extern void
+x86_asm_output_function_prefix (FILE *asm_out_file,
+				const char *fnname ATTRIBUTE_UNUSED);
+#undef ASM_OUTPUT_FUNCTION_PREFIX
+#define ASM_OUTPUT_FUNCTION_PREFIX x86_asm_output_function_prefix
+
 /*
 Local variables:
 version-control: t
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1064,6 +1064,10 @@ mindirect-branch=
 Target RejectNegative Joined Enum(indirect_branch) Var(ix86_indirect_branch) Init(indirect_branch_keep)
 Convert indirect call and jump to call and return thunks.
 
+mforce-function-padding
+Target Var(flag_force_function_padding) Init(0)
+Put a 16 byte padding area before each function
+
 mfunction-return=
 Target RejectNegative Joined Enum(indirect_branch) Var(ix86_function_return) Init(indirect_branch_keep)
 Convert function return to call and return thunk.
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1451,6 +1451,7 @@ See RS/6000 and PowerPC Options.
 -mindirect-branch=@var{choice}  -mfunction-return=@var{choice} @gol
 -mindirect-branch-register -mharden-sls=@var{choice} @gol
 -mindirect-branch-cs-prefix -mneeded -mno-direct-extern-access}
+-mforce-function-padding @gol
 
 @emph{x86 Windows Options}
 @gccoptlist{-mconsole  -mcygwin  -mno-cygwin  -mdll @gol
@@ -32849,6 +32850,11 @@ Force all calls to functions to be indir
 when using Intel Processor Trace where it generates more precise timing
 information for function calls.
 
+@item -mforce-function-padding
+@opindex -mforce-function-padding
+Force a 16 byte padding area before each function which allows run-time
+code patching to put a special prologue before the function entry.
+
 @item -mmanual-endbr
 @opindex mmanual-endbr
 Insert ENDBR instruction at function entry only via the @code{cf_check}
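
As an illustration of how a consumer could tell whether a function was built
with this option, the following sketch checks for the int3 filled area in
front of an entry point. This is only a guess at the shape of such a check
and not the kernel side implementation, which is not shown in this mail.
has_entry_padding(), FUNC_PAD_SIZE and INT3_OPCODE are hypothetical names;
the values 16 and 0xcc are taken from the ".skip 16,0xcc" directive above.

```c
#include <stdbool.h>
#include <stddef.h>

#define FUNC_PAD_SIZE	16	/* size used by ".skip 16,0xcc" */
#define INT3_OPCODE	0xcc	/* fill byte used by ".skip 16,0xcc" */

/*
 * Hypothetical helper: return true if the FUNC_PAD_SIZE bytes directly in
 * front of 'entry' are all int3. The caller has to guarantee that these
 * bytes are readable.
 */
static bool has_entry_padding(const unsigned char *entry)
{
	const unsigned char *pad = entry - FUNC_PAD_SIZE;
	size_t i;

	for (i = 0; i < FUNC_PAD_SIZE; i++) {
		if (pad[i] != INT3_OPCODE)
			return false;
	}
	return true;
}
```

A caller would use the padding area when the check succeeds and otherwise
take the fallback path, e.g. the allocated thunks mentioned in the quoted
text.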
