Message-ID: <a793c733-267d-4930-8ee2-0fc0f24c3538@zytor.com>
Date: Sun, 28 Apr 2024 15:28:44 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Xin Li <xin@...or.com>, Andrew Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Kees Cook <keescook@...omium.org>
Cc: LKML <linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>
Subject: x86: dynamic pt_regs pointer and kernel stack randomization
So the issue of keeping an explicit pointer to the user-space pt_regs
structure has come up in the context of future FRED extensions, which
may increase the size of the exception stack frame under some
circumstances; that size may not even be constant in the first place.
It struck me that this can be combined with kernel stack randomization
in such a way that it mostly amortizes the cost of both.
Downside: for best performance it should be pushed into the assembly
entry/exit code, although that is technically not *required* (on IDT it
is of course possible to do it in C, but for FRED it would go in the one
single assembly entry stub.)
In the FRED code it would look like [simplified]:

asm_fred_entrypoint_user:
	/* Current code */
	/* ... */
	movq	%rsp,%rdi			/* part of FRED_ENTER */
	/* New code */
	movq	%rdi,PER_CPU_VAR(user_pt_regs_ptr)	/* [1] */
#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
	subq	PER_CPU_VAR(kstack_offset),%rsp		/* [2] */
#endif
	/* CFI annotation */
	/* Current code */
	call	fred_entry_from_user

asm_fred_exit_user:
	/* New code */
	movq	PER_CPU_VAR(user_pt_regs_ptr),%rsp
	/* CFI annotation */
	/* Current code */
	/* ... */
	ERETU
[1] - This instruction can be pushed into the C code in
fred_entry_from_user() without changing the functionality in any way.
[2] - This is the ONLY instruction in this sequence that would be
specific to CONFIG_RANDOMIZE_KSTACK_OFFSET, and it probably isn't even
worth patching out.
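The bookkeeping above can be simulated in user-space C, with plain
globals standing in for the per-CPU variables; apart from the names
taken from this message, the scaffolding is purely illustrative:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t user_pt_regs_ptr;  /* PER_CPU_VAR(user_pt_regs_ptr) */
static uint64_t kstack_offset;     /* PER_CPU_VAR(kstack_offset), premasked */

/* Entry: %rsp points at the user pt_regs frame. Save that pointer,
 * then drop %rsp by the randomized offset ([1] and [2] above). */
static uint64_t fred_enter(uint64_t rsp)
{
	user_pt_regs_ptr = rsp;      /* movq %rdi,PER_CPU_VAR(user_pt_regs_ptr) */
	return rsp - kstack_offset;  /* subq PER_CPU_VAR(kstack_offset),%rsp */
}

/* Exit: a single load restores %rsp to the saved frame; nothing about
 * the randomization has to be remembered or undone, which is what
 * amortizes its cost. */
static uint64_t fred_exit(void)
{
	return user_pt_regs_ptr;     /* movq PER_CPU_VAR(user_pt_regs_ptr),%rsp */
}
```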
This requires a 64-bit premasked value to be generated by
choose_random_kstack_offset(), which would seem to be the better option
for performance anyway, especially since there is already arithmetic
being done at that time; otherwise the entry path needs three
instructions instead of one. This means the randomness accumulator ends
up in a separate variable from the premasked value. This could be
further very slightly optimized by adding in the actual stack location
and making [2] a movq, but then that value would have to be
context-switched; this is probably not all that useful.
The masking also needs to take stack alignment into account, which the
current code doesn't; that by itself adds extra instructions to the
existing code sequences.
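A sketch of how choose_random_kstack_offset() could publish such a
premasked, alignment-respecting value (the mask width and the 16-byte
alignment here are assumptions for illustration, not the kernel's
actual constants):

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_MASK  0x3FFU   /* assumed permitted offset range */
#define STACK_ALIGN  16U      /* assumed required stack alignment */

static uint32_t kstack_rand_accum;  /* raw randomness accumulator */
static uint64_t kstack_offset;      /* premasked value used at entry */

static void choose_random_kstack_offset(uint32_t rand)
{
	/* Keep accumulating entropy in its own variable... */
	kstack_rand_accum ^= rand;

	/* ...and publish a value that is already masked to the valid
	 * range and rounded down to the required alignment, so the
	 * entry path needs only the single subq shown above. */
	kstack_offset = (uint64_t)(kstack_rand_accum & KSTACK_MASK)
			& ~(uint64_t)(STACK_ALIGN - 1);
}
```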
That is literally *all* the code that is needed to replace
add_random_kstack_offset(). It doesn't block tailcall optimization
anywhere. If user_pt_regs_ptr and kstack_offset share a cache line it
becomes even cheaper.
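One way to sketch that cache-line sharing as plain C (the struct and
its name are hypothetical; in the kernel this would be a per-CPU
declaration rather than a bare global):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keeping both variables in one 64-byte-aligned struct guarantees the
 * entry sequence touches a single cache line for its two accesses. */
struct kstack_entry_state {
	uint64_t user_pt_regs_ptr;  /* saved user frame pointer */
	uint64_t kstack_offset;     /* premasked randomization offset */
} __attribute__((aligned(64)));     /* one x86 cache line */
```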
Note that at least in the FRED case this code would be invoked even on
events other than system calls, some of which may be controllable by the
user, like page faults. I am guessing that this is actually a plus.
-hpa