[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ytl2vg15Hc0fh8Dl@worktop.programming.kicks-ass.net>
Date: Thu, 21 Jul 2022 17:54:38 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Sami Tolvanen <samitolvanen@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Joao Moreira <joao@...rdrivepizza.com>,
LKML <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
"Cooper, Andrew" <andrew.cooper3@...rix.com>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Johannes Wikner <kwikner@...z.ch>,
Alyssa Milburn <alyssa.milburn@...ux.intel.com>,
Jann Horn <jannh@...gle.com>, "H.J. Lu" <hjl.tools@...il.com>,
"Moreira, Joao" <joao.moreira@...el.com>,
"Nuzman, Joseph" <joseph.nuzman@...el.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Gross, Jurgen" <jgross@...e.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Peter Collingbourne <pcc@...gle.com>,
Kees Cook <keescook@...omium.org>
Subject: Re: [patch 00/38] x86/retbleed: Call depth tracking mitigation
On Wed, Jul 20, 2022 at 11:13:16PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 19, 2022 at 10:19:18AM -0700, Sami Tolvanen wrote:
>
> > Clang's current CFI implementation is somewhat similar to this. It
> > creates separate thunks for address-taken functions and changes
> > function addresses in C code to point to the thunks instead.
> >
> > While this works, it creates painful situations when interacting with
> > assembly (e.g. a function address taken in assembly cannot be used
> > for indirect calls in C as it doesn't point to the thunk) and needs
> > unpleasant hacks when we want take the actual function address in C
> > (i.e. scattering the code with function_nocfi() calls).
> >
> > I have to agree with Peter on this, I would rather avoid messing with
> > function pointers in KCFI to avoid these issues.
>
> It is either this; and I think I can avoid the worst of it (see below);
> or grow the indirect_callsites to obscure the immediate (as Linus
> suggested), there's around ~16k indirect callsites in a defconfig-ish
> kernel, so growing it isn't too horrible, but it isn't nice either.
>
> The prettiest option to obscure the immediate at the callsite I could
> conjure up is something like:
>
> kcfi_caller_linus:
> movl $0x12345600, %r10d
> movb $0x78, %r10b
> cmpl %r10d, -OFFSET(%r11)
> je 1f
> ud2
> 1: call __x86_thunk_indirect_r11
>
> Which comes to around 22 bytes (+5 over the original).
My very firstest LLVM patch; except it explodes at runtime and I'm not
sure where to start looking...
On top of sami-llvm/kcfi
If I comment out the orl and cmpl it compiles stuff, put either one back
and it explodes in some very unhelpful message.
The idea is the above callthunk that makes any offset work by not having
the full hash as an immediate and allow kCFI along with
-fpatchable-function-entry=N,M and include M in the offset.
Specifically, I meant to use -fpatchable-function-entry=16,16, but alas,
I never got that far.
Help ?
---
diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index 5e011d409ee8..ffdb95324da7 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -124,23 +124,12 @@ void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF,
OutStreamer->emitSymbolAttribute(FnSym, MCSA_ELF_TypeFunction);
OutStreamer->emitLabel(FnSym);
- // Emit int3 padding to allow runtime patching of the preamble.
- EmitAndCountInstruction(MCInstBuilder(X86::INT3));
- EmitAndCountInstruction(MCInstBuilder(X86::INT3));
-
// Embed the type hash in the X86::MOV32ri instruction to avoid special
// casing object file parsers.
EmitAndCountInstruction(MCInstBuilder(X86::MOV32ri)
.addReg(X86::EAX)
.addImm(Type->getZExtValue()));
- // The type hash is encoded in the last four bytes of the X86::MOV32ri
- // instruction. Emit additional X86::INT3 padding to ensure the hash is
- // at offset -6 from the function start to avoid potential call target
- // gadgets in checks emitted by X86AsmPrinter::LowerKCFI_CHECK.
- EmitAndCountInstruction(MCInstBuilder(X86::INT3));
- EmitAndCountInstruction(MCInstBuilder(X86::INT3));
-
if (MAI->hasDotTypeDotSizeDirective()) {
MCSymbol *EndSym = OutContext.createTempSymbol("cfi_func_end");
OutStreamer->emitLabel(EndSym);
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 16c4d2e45970..d72e82f4f63a 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1340,22 +1340,34 @@ void X86AsmPrinter::LowerKCFI_CHECK(const MachineInstr &MI) {
assert(std::next(MI.getIterator())->isCall() &&
"KCFI_CHECK not followed by a call instruction");
- // The type hash is encoded in the last four bytes of the X86::CMP32mi
- // instruction. If we decided to place the hash immediately before
- // indirect call targets (offset -4), the X86::JCC_1 instruction we'll
- // emit next would be a potential indirect call target as it's preceded
- // by a valid type hash.
- //
- // To avoid generating useful gadgets, X86AsmPrinter::emitKCFITypeId
- // emits the type hash prefix at offset -6, which makes X86::TRAP the
- // only possible target in this instruction sequence.
- EmitAndCountInstruction(MCInstBuilder(X86::CMP32mi)
+ const Function &F = MF->getFunction();
+ unsigned Imm = MI.getOperand(1).getImm();
+ unsigned Num;
+
+ if (F.hasFnAttribute("patchable-function-prefix")) {
+ if (F.getFnAttribute("patchable-function-prefix")
+ .getValueAsString()
+ .getAsInteger(10, Num))
+ Num = 0;
+ }
+
+ // movl $0x12345600, %r10d
+ EmitAndCountInstruction(MCInstBuilder(X86::MOV32ri)
+ .addReg(X86::R10)
+ .addImm(Imm & ~0xff));
+
+ // orl $0x78, %r10d
+ EmitAndCountInstruction(MCInstBuilder(X86::OR32ri8)
+ .addReg(X86::R10)
+ .addImm(Imm & 0xff));
+
+ // cmpl %r10, -off(%r11)
+ EmitAndCountInstruction(MCInstBuilder(X86::CMP32rm)
+ .addReg(X86::R10)
.addReg(MI.getOperand(0).getReg())
.addImm(1)
.addReg(X86::NoRegister)
- .addImm(-6)
- .addReg(X86::NoRegister)
- .addImm(MI.getOperand(1).getImm()));
+ .addImm(-(Num + 4)));
MCSymbol *Pass = OutContext.createTempSymbol();
EmitAndCountInstruction(
Powered by blists - more mailing lists