Date:   Tue, 2 Nov 2021 13:57:44 +0100
From:   Peter Zijlstra <>
To:     Ard Biesheuvel <>
Cc:     Sami Tolvanen <>,
        Mark Rutland <>, X86 ML <>,
        Kees Cook <>,
        Josh Poimboeuf <>,
        Nathan Chancellor <>,
        Nick Desaulniers <>,
        Sedat Dilek <>,
        Steven Rostedt <>,
        Linux Kernel Mailing List <>,
Subject: Re: [PATCH] static_call,x86: Robustify trampoline patching

On Mon, Nov 01, 2021 at 03:14:41PM +0100, Ard Biesheuvel wrote:
> On Mon, 1 Nov 2021 at 10:05, Peter Zijlstra <> wrote:

> > How is that not true for the jump table approach? Like I showed earlier,
> > it is *trivial* to reconstruct the actual function pointer from a
> > jump-table entry pointer.
> >
> That is not the point. The point is that Clang instruments every
> indirect call that it emits, to check whether the type of the jump
> table entry it is about to call matches the type of the caller. IOW,
> the indirect calls can only branch into jump tables, and all jump
> table entries in a table each branch to the start of some function of
> the same type.
> So the only thing you could achieve by adding or subtracting a
> constant value from the indirect call address is either calling
> another function of the same type (if you are hitting another entry in
> the same table), or failing the CFI type check.

Ah, I see, so the call-site needs to have a branch around the indirect
call instruction.
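
As a rough user-space illustration of that call-site check (a model only, not what Clang actually emits; real jump-table entries are branch instructions rather than pointers, and every name below is hypothetical), the caller verifies that the target address is an aligned slot inside the table for its function type and branches around the indirect call otherwise:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: every function of type int(int) gets one table slot. */
typedef int (*int_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

static const int_fn int_fn_table[] = { add_one, twice };
#define TABLE_LEN (sizeof(int_fn_table) / sizeof(int_fn_table[0]))

/* Accept only addresses that are aligned slots inside the table. */
static int cfi_slot_ok(uintptr_t addr)
{
	/* Unsigned subtraction wraps for addr below the table, failing the range test. */
	uintptr_t off = addr - (uintptr_t)int_fn_table;

	return off % sizeof(int_fn) == 0 && off / sizeof(int_fn) < TABLE_LEN;
}

/* Checked indirect call: on a failed check the call is skipped entirely. */
int cfi_call(uintptr_t addr, int arg, int *ok)
{
	assert(ok != NULL);
	*ok = cfi_slot_ok(addr);
	if (!*ok)
		return 0;	/* a real implementation would trap here */
	return (*(const int_fn *)addr)(arg);
}
```

In this model, adding sizeof(int_fn) to a valid slot address reaches another function of the same type, while any other offset fails the check, which mirrors the argument quoted above.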

> Instrumenting the callee only needs something like BTI, and a
> consistent use of the landing pads to ensure that you cannot trivially
> omit the check by landing right after it.

That does bring up another point tho; how are we going to do a kernel
that's optimal for both software CFI and hardware aided CFI?

All questions that need answering I think.

So how insane is something like this, have each function:

foo.cfi:
	endbr64
	xorl $0xdeadbeef, %r10d
	jz foo
	ud2
	nop	# make it 16 bytes
foo:
	# actual function text goes here

And for each hash have two thunks:

	# arg: r11
	# clobbers: r10, r11
__x86_indirect_cfi_deadbeef:
	movl -9(%r11), %r10d		# immediate in foo.cfi
	xorl $0xdeadbeef, %r10d		# our immediate
	jz 1f
	ud2
1:	ALTERNATIVE_2	"jmp *%r11",
			"jmp __x86_indirect_thunk_r11", X86_FEATURE_RETPOLINE
			"lfence; jmp *%r11", X86_FEATURE_RETPOLINE_AMD

	# arg: r11
	# clobbers: r10, r11
__x86_indirect_ibt_deadbeef:
	movl $0xdeadbeef, %r10d
	subq $0x10, %r11		# back up to foo.cfi
	jmp *%r11
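
The logic of the two thunks can be modelled in plain C, purely as an illustration. The struct layout, function names and the second hash value are assumptions made for this sketch; a real kernel would embed the hash as an instruction immediate in the function preamble and trap on mismatch rather than return a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TYPE_HASH 0xdeadbeefu

/* Hypothetical stand-in for "preamble immediate + function body". */
struct cfi_func {
	uint32_t hash;		/* immediate embedded in the preamble */
	int (*body)(int);	/* actual function text */
};

static int foo_body(int x) { return x + 1; }
static int bar_body(int x) { return x - 1; }

const struct cfi_func foo = { TYPE_HASH, foo_body };
const struct cfi_func bar = { 0x12345678u, bar_body };	/* different type */

/* Caller-side check: load the target's immediate, XOR with our own hash. */
int indirect_cfi_deadbeef(const struct cfi_func *target, int arg, int *ok)
{
	assert(target != NULL && ok != NULL);
	uint32_t r10 = target->hash;	/* movl -9(%r11), %r10d */
	r10 ^= TYPE_HASH;		/* xorl $0xdeadbeef, %r10d */
	*ok = (r10 == 0);		/* jz: hashes matched */
	return *ok ? target->body(arg) : 0;
}

/* Models the check done by the callee's own preamble. */
static int preamble_check(const struct cfi_func *self, uint32_t r10)
{
	return (r10 ^ self->hash) == 0;
}

/* Callee-side check: pass our hash in r10 and enter through the preamble. */
int indirect_ibt_deadbeef(const struct cfi_func *target, int arg, int *ok)
{
	assert(target != NULL && ok != NULL);
	uint32_t r10 = TYPE_HASH;	/* movl $0xdeadbeef, %r10d */
	*ok = preamble_check(target, r10);
	return *ok ? target->body(arg) : 0;
}
```

Both variants accept foo and reject bar; they differ only in whether the comparison runs in the thunk before the jump or in the target's preamble after it.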

And have the actual indirect callsite look like:

	# r11 - &foo
	ALTERNATIVE_2	"cs call __x86_indirect_thunk_r11",
			"cs call __x86_indirect_cfi_deadbeef", X86_FEATURE_CFI
			"cs call __x86_indirect_ibt_deadbeef", X86_FEATURE_IBT

Although if the compiler were to emit:

	cs call __x86_indirect_cfi_deadbeef

we could probably fix it up from there.

Then we can at runtime decide between:

  {!cfi, cfi, ibt} x {!retpoline, retpoline, retpoline-amd}
