Message-ID: <a793c733-267d-4930-8ee2-0fc0f24c3538@zytor.com>
Date: Sun, 28 Apr 2024 15:28:44 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Xin Li <xin@...or.com>, Andrew Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Kees Cook <keescook@...omium.org>
Cc: LKML <linux-kernel@...r.kernel.org>, "x86@...nel.org" <x86@...nel.org>
Subject: x86: dynamic pt_regs pointer and kernel stack randomization
So the issue of keeping an explicit pointer to the user-space pt_regs
structure has come up in the context of future FRED extensions, which
may increase the size of the exception stack frame under some
circumstances; that size may not even be constant in the first place.
It struck me that this can be combined with kernel stack randomization
in such a way that it mostly amortizes the cost of both.
Downside: for best performance it should be pushed into the assembly
entry/exit code, although that is technically not *required* (on IDT it
is of course possible to do it in C, but for FRED it would go in the one
single assembly entry stub.)
In the FRED code it would look like [simplified]:

asm_fred_entrypoint_user:
	/* Current code */
	/* ... */
	movq	%rsp,%rdi			/* part of FRED_ENTER */
	/* New code */
	movq	%rdi,PER_CPU_VAR(user_pt_regs_ptr)	/* [1] */
#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
	subq	PER_CPU_VAR(kstack_offset),%rsp		/* [2] */
#endif
	/* CFI annotation */
	/* Current code */
	call	fred_entry_from_user

asm_fred_exit_user:
	/* New code */
	movq	PER_CPU_VAR(user_pt_regs_ptr),%rsp
	/* CFI annotation */
	/* Current code */
	/* ... */
	ERETU
[1] - This instruction can be pushed into the C code in
fred_entry_from_user() without changing the functionality in any way.
[2] - This is the ONLY instruction in this sequence that would be
specific to CONFIG_RANDOMIZE_KSTACK_OFFSET, and it probably isn't even
worth patching out.
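The bookkeeping above can be simulated in user-space C, with plain
globals standing in for the per-CPU variables; apart from the names
taken from this message, the scaffolding is purely illustrative:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t user_pt_regs_ptr;  /* PER_CPU_VAR(user_pt_regs_ptr) */
static uint64_t kstack_offset;     /* PER_CPU_VAR(kstack_offset), premasked */

/* Entry: %rsp points at the user pt_regs frame. Save that pointer,
 * then drop %rsp by the randomized offset ([1] and [2] above). */
static uint64_t fred_enter(uint64_t rsp)
{
	user_pt_regs_ptr = rsp;      /* movq %rdi,PER_CPU_VAR(user_pt_regs_ptr) */
	return rsp - kstack_offset;  /* subq PER_CPU_VAR(kstack_offset),%rsp */
}

/* Exit: a single load restores %rsp to the saved frame; nothing about
 * the randomization has to be remembered or undone, which is what
 * amortizes its cost. */
static uint64_t fred_exit(void)
{
	return user_pt_regs_ptr;     /* movq PER_CPU_VAR(user_pt_regs_ptr),%rsp */
}
```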
This requires a 64-bit premasked value to be generated by
choose_random_kstack_offset(), which would seem to be the better option
for performance anyway, especially since there is already arithmetic
being done at that time; otherwise the entry path needs three
instructions instead of one. This means the randomness accumulator ends
up in a separate variable from the premasked value. This could be
further very slightly optimized by adding in the actual stack location
and making [2] a movq, but then that value would have to be
context-switched; this is probably not all that useful.
The masking also needs to take stack alignment into account, which the
current code doesn't; that by itself adds extra instructions to the
existing code sequences.
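A sketch of how choose_random_kstack_offset() could publish such a
premasked, alignment-respecting value (the mask width and the 16-byte
alignment here are assumptions for illustration, not the kernel's
actual constants):

```c
#include <assert.h>
#include <stdint.h>

#define KSTACK_MASK  0x3FFU   /* assumed permitted offset range */
#define STACK_ALIGN  16U      /* assumed required stack alignment */

static uint32_t kstack_rand_accum;  /* raw randomness accumulator */
static uint64_t kstack_offset;      /* premasked value used at entry */

static void choose_random_kstack_offset(uint32_t rand)
{
	/* Keep accumulating entropy in its own variable... */
	kstack_rand_accum ^= rand;

	/* ...and publish a value that is already masked to the valid
	 * range and rounded down to the required alignment, so the
	 * entry path needs only the single subq shown above. */
	kstack_offset = (uint64_t)(kstack_rand_accum & KSTACK_MASK)
			& ~(uint64_t)(STACK_ALIGN - 1);
}
```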
That is literally *all* the code that is needed to replace
add_random_kstack_offset(). It doesn't block tailcall optimization
anywhere. If user_pt_regs_ptr and kstack_offset share a cache line it
becomes even cheaper.
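One way to sketch that cache-line sharing as plain C (the struct and
its name are hypothetical; in the kernel this would be a per-CPU
declaration rather than a bare global):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Keeping both variables in one 64-byte-aligned struct guarantees the
 * entry sequence touches a single cache line for its two accesses. */
struct kstack_entry_state {
	uint64_t user_pt_regs_ptr;  /* saved user frame pointer */
	uint64_t kstack_offset;     /* premasked randomization offset */
} __attribute__((aligned(64)));     /* one x86 cache line */
```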
Note that at least in the FRED case this code would be invoked even on
events other than system calls, some of which may be controllable by the
user, like page faults. I am guessing that this is actually a plus.
-hpa