linux-kernel - Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ytp6ooFQMLr3/h6v@worktop.programming.kicks-ass.net>
Date:   Fri, 22 Jul 2022 12:23:30 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Sami Tolvanen <samitolvanen@...gle.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Joao Moreira <joao@...rdrivepizza.com>,
        LKML <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        "Cooper, Andrew" <andrew.cooper3@...rix.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Johannes Wikner <kwikner@...z.ch>,
        Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
        Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
        "Moreira, Joao" <joao.moreira@...el.com>,
        "Nuzman, Joseph" <joseph.nuzman@...el.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        "Gross, Jurgen" <jgross@...e.com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Peter Collingbourne <pcc@...gle.com>,
        Kees Cook <keescook@...omium.org>
Subject: Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation

On Thu, Jul 21, 2022 at 05:16:14PM -0700, Sami Tolvanen wrote:

> That looks good to me. I updated my LLVM tree to generate this code
> for the checks:
> 
> https://github.com/samitolvanen/llvm-project/commits/kcfi

Thanks!

The alignment thing you added:

  // Emit int3 padding before the type information to maintain alignment.
  // The X86::MOV32ri instruction we emit is 5 bytes long.
  uint64_t Padding = offsetToAlignment(5, MF.getAlignment());
  while (Padding--)
    EmitAndCountInstruction(MCInstBuilder(X86::INT3));

Doesn't seem to quite do what we want though.

When I use -fpatchable-function-entry=16,16 we effectively get a 32 byte
prefix on every function:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
       0:       cc                      int3
       1:       cc                      int3
       2:       cc                      int3
       3:       cc                      int3
       4:       cc                      int3
       5:       cc                      int3
       6:       cc                      int3
       7:       cc                      int3
       8:       cc                      int3
       9:       cc                      int3
       a:       cc                      int3
       b:       b8 26 b1 df 98          mov    $0x98dfb126,%eax
      10:       90                      nop
      11:       90                      nop
      12:       90                      nop
      13:       90                      nop
      14:       90                      nop
      15:       90                      nop
      16:       90                      nop
      17:       90                      nop
      18:       90                      nop
      19:       90                      nop
      1a:       90                      nop
      1b:       90                      nop
      1c:       90                      nop
      1d:       90                      nop
      1e:       90                      nop
      1f:       90                      nop

And given the parameters, that's indeed the only option. However, given
I can scribble the type thing just fine when moving to FineIBT and the
whole Skylake depth tracking only needs 10 bytes, I figured I'd try:
-fpatchable-function-entry=11,11 instead. But that resulted in
unalignment:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
       0:       cc                      int3
       1:       cc                      int3
       2:       cc                      int3
       3:       cc                      int3
       4:       cc                      int3
       5:       cc                      int3
       6:       cc                      int3
       7:       cc                      int3
       8:       cc                      int3
       9:       cc                      int3
       a:       cc                      int3
       b:       b8 26 b1 df 98          mov    $0x98dfb126,%eax
      10:       90                      nop
      11:       90                      nop
      12:       90                      nop
      13:       90                      nop
      14:       90                      nop
      15:       90                      nop
      16:       90                      nop
      17:       90                      nop
      18:       90                      nop
      19:       90                      nop
      1a:       90                      nop

000000000000001b <__traceiter_sched_kthread_stop>:

However, if I change clang like so:

 llvm/lib/Target/X86/X86AsmPrinter.cpp | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 789597f8ef1a..6c94313a197d 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -124,9 +124,15 @@ void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF,
     OutStreamer->emitSymbolAttribute(FnSym, MCSA_ELF_TypeFunction);
   OutStreamer->emitLabel(FnSym);
 
+  int64_t PrefixNops = 0;
+  (void)MF.getFunction()
+      .getFnAttribute("patchable-function-prefix")
+      .getValueAsString()
+      .getAsInteger(10, PrefixNops);
+
   // Emit int3 padding before the type information to maintain alignment.
   // The X86::MOV32ri instruction we emit is 5 bytes long.
-  uint64_t Padding = offsetToAlignment(5, MF.getAlignment());
+  uint64_t Padding = offsetToAlignment(5+PrefixNops, MF.getAlignment());
   while (Padding--)
     EmitAndCountInstruction(MCInstBuilder(X86::INT3));
 

Then it becomes:

0000000000000000 <__cfi___traceiter_sched_kthread_stop>:
       0:       b8 26 b1 df 98          mov    $0x98dfb126,%eax
       5:       90                      nop
       6:       90                      nop
       7:       90                      nop
       8:       90                      nop
       9:       90                      nop
       a:       90                      nop
       b:       90                      nop
       c:       90                      nop
       d:       90                      nop
       e:       90                      nop
       f:       90                      nop

0000000000000010 <__traceiter_sched_kthread_stop>:

and things are 'good' again, except for functions that don't get a kcfi
preamble, those are unaligned... I couldn't find where the
patchable-function-prefix nops are generated to fix this up :/


Also; could you perhaps add a switch to supress ENDBR for functions with
a kCFI preamble ?