linux-hardening - Re: [RFC PATCH 01/11] x86: kernel FineIBT


Message-ID: <Ynj5WLYnuWs/3oZW@hirez.programming.kicks-ass.net>
Date:   Mon, 9 May 2022 13:22:00 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Kees Cook <keescook@...omium.org>
Cc:     Peter Collingbourne <pcc@...gle.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Joao Moreira <joao@...rdrivepizza.com>,
        linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
        andrew.cooper3@...rix.com, samitolvanen@...gle.com,
        mark.rutland@....com, hjl.tools@...il.com,
        alyssa.milburn@...ux.intel.com, ndesaulniers@...gle.com,
        gabriel.gomes@...ux.intel.com, rick.p.edgecombe@...el.com
Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Sun, May 08, 2022 at 01:29:13AM -0700, Kees Cook wrote:
> On Wed, May 04, 2022 at 08:16:57PM +0200, Peter Zijlstra wrote:
> > 	FineIBT						kCFI
> > 
> > __fineibt_\hash:
> > 	xor	\hash, %r10	# 7
> > 	jz	1f		# 2
> > 	ud2			# 2
> > 1:	ret			# 1
> > 	int3			# 1
> > 
> > 
> > __cfi_\sym:					__cfi_\sym:
> > 							int3; int3				# 2
> > 	endbr			# 4			mov	\hash, %eax			# 5
> > 	call	__fineibt_\hash	# 5			int3; int3				# 2
> > \sym:						\sym:
> > 	...						...
> > 
> > 
> > caller:						caller:
> > 	movl	\hash, %r10d	# 6			cmpl	\hash, -6(%r11)			# 8
> > 	sub	$9, %r11	# 4			je	1f				# 2
> > 	call	*%r11		# 3			ud2					# 2
> > 	.nop 4			# 4 (or fixup r11)	call	__x86_indirect_thunk_r11	# 5
> 
> This looks good!
> 
> And just to double-check my understanding here... \sym is expected to
> start with endbr with IBT + kCFI?

Ah, the thinking was that 'if IBT then FineIBT', so the combination of
kCFI and IBT is of no concern. And since FineIBT will have the ENDBR in
the __cfi_\sym thing, \sym will not need it.

But thinking about this now, I suppose __nocfi call symbols will still
need the ENDBR. Objtool IBT validation would need to either find
ENDBR or a matching __cfi_\sym, I suppose.
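
As a rough illustration of the "either find ENDBR or a matching
__cfi_\sym" check, here is a minimal sketch in C. It is not objtool
code: has_landing_pad() and is_endbr() are hypothetical names, and the
sketch assumes the FineIBT preamble from the listing above (a 4-byte
endbr followed by a 5-byte call rel32, so 9 bytes in front of \sym).
Real validation would presumably work from the symbol table rather than
from raw bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encoding of endbr64 on x86-64: f3 0f 1e fa. */
static const uint8_t endbr64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };

/* Assumed preamble size: endbr (4 bytes) + call rel32 (5 bytes). */
enum { FINEIBT_PREAMBLE_LEN = 9 };

static bool is_endbr(const uint8_t *p)
{
	return memcmp(p, endbr64, sizeof(endbr64)) == 0;
}

/*
 * Hypothetical helper: 'text' is a buffer of 'len' code bytes and 'off'
 * is the offset of \sym inside it. Returns true if \sym itself starts
 * with ENDBR, or if it is preceded by a preamble that starts with ENDBR
 * and continues with a call rel32 opcode (0xe8).
 */
static bool has_landing_pad(const uint8_t *text, size_t len, size_t off)
{
	assert(text != NULL && off <= len);

	/* Case 1: ENDBR directly at \sym (e.g. a __nocfi call target). */
	if (len - off >= sizeof(endbr64) && is_endbr(text + off))
		return true;

	/* Case 2: a matching __cfi_\sym preamble right before \sym. */
	if (off >= FINEIBT_PREAMBLE_LEN) {
		const uint8_t *pre = text + off - FINEIBT_PREAMBLE_LEN;

		return is_endbr(pre) && pre[4] == 0xe8;
	}

	return false;
}
```

Calling has_landing_pad(buf, sizeof(buf), 9) on a buffer that holds
such a preamble in bytes 0..8 returns true, while the same call on a
buffer of plain NOPs returns false.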

So I was talking to Joao on IRC the other day, and I realized that if
kCFI generates code as per the above, then we can do FineIBT purely
in-kernel. That is, have objtool generate a section of __cfi_\sym
locations. Then use the .retpoline_sites and .cfi_sites to rewrite kCFI
into the FineIBT form in multiple passes:

 - read all the __cfi_\sym sites and collect all unique hash values

 - allocate (module) memory and write __fineibt_\hash functions for each
   unique hash value found

 - rewrite callers; nop out kCFI

 - rewrite all __cfi_\sym

 - rewrite all callers

 - enable IBT

And the same on module load I suppose.

But I've only thought about this, not actually implemented it, so who
knows what surprises are lurking there :-)
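
For illustration only, the first pass (reading every __cfi_\sym site
and collecting the unique hash values) might look roughly like the
sketch below. This is plain user-space C, not kernel code. The names
read_hash() and collect_unique_hashes() are hypothetical, and the
sketch assumes the kCFI preamble layout from the quoted listing: two
int3 bytes, then a 5-byte "mov \hash, %eax" (opcode 0xb8 plus a 32-bit
immediate), so the hash sits 3 bytes into the preamble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: int3 int3 | b8 <imm32 hash> | int3 int3 */
enum { KCFI_HASH_OFFSET = 3 };

/* Read the 32-bit hash immediate out of one __cfi_\sym preamble. */
static uint32_t read_hash(const uint8_t *cfi_site)
{
	uint32_t hash;

	memcpy(&hash, cfi_site + KCFI_HASH_OFFSET, sizeof(hash));
	return hash;
}

/*
 * Walk all sites and store each distinct hash once in 'out'.
 * Returns the number of unique hashes, or (size_t)-1 if 'out'
 * (capacity 'cap') is too small. The linear duplicate search keeps
 * the sketch short; a real implementation would likely sort or hash.
 */
static size_t collect_unique_hashes(const uint8_t *const *sites,
				    size_t nsites,
				    uint32_t *out, size_t cap)
{
	size_t n = 0;

	assert(sites != NULL && out != NULL);

	for (size_t i = 0; i < nsites; i++) {
		uint32_t h = read_hash(sites[i]);
		size_t j;

		for (j = 0; j < n; j++)
			if (out[j] == h)
				break;
		if (j < n)
			continue;	/* hash already recorded */
		if (n == cap)
			return (size_t)-1;
		out[n++] = h;
	}

	return n;
}
```

Each value returned this way would then correspond to one
__fineibt_\hash function to be written in the second step. Allocating
executable memory and patching the call sites are left out of the
sketch.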

> Random extra thoughts... feel free to ignore. :) Given that both CFI
> schemes depend on an attacker not being able to construct an executable
> memory region that either starts with endbr (for FineIBT) or starts with
> hash & 2 bytes (for kCFI), we should likely take another look at where
> the kernel uses PAGE_KERNEL_EXEC.
> 
> It seems non-specialized use is entirely done via module_alloc(). Obviously
> modules need to stay as-is. So we're left with other module_alloc()
> callers: BPF JIT, ftrace, and kprobes.
> 
> Perhaps enabling CFI should tie bpf_jit_harden (which performs constant
> blinding) to the value of bpf_jit_enable? (i.e. either use BPF VM which
> reads from non-exec memory, or use BPF JIT with constant blinding.)
> 
> I *think* all the kprobes and ftrace stuff ends up using constructed
> direct calls, though, yes? So if we did bounds checking, we could
> "exclude" them as well as the BPF JIT. Though I'm not sure how
> controllable the content written to the kprobes and ftrace regions are,
> though?

Both ftrace and kprobes only write fairly simple trampolines based off of
a template. Neither has indirect calls.

> For exclusion, we could separate actual modules from the other
> module_alloc() users by maybe allocating in opposite directions from the
> randomized offset and check indirect calls against the kernel text bounds
> and the new modules-only bounds. Sounds expensive, though. Maybe PKS,
> but I can't imagine 2 MSR writes per indirect call would be fast. Hmm...

I'm not sure what problem you're trying to solve..