Message-ID: <c46f5614-a82e-42fc-91eb-05e483a7df9c@citrix.com>
Date: Thu, 13 Feb 2025 01:31:30 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: jannh@...gle.com
Cc: jmill@....edu, joao@...rdrivepizza.com, kees@...nel.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
luto@...nel.org, samitolvanen@...gle.com,
"Peter Zijlstra (Intel)" <peterz@...radead.org>
Subject: Re: [RFC] Circumventing FineIBT Via Entrypoints
>> Assuming this is an issue you all feel is worth addressing, I will
>> continue working on providing a patch. I'm concerned though that the
>> overhead from adding a wrmsr on both syscall entry and exit to
>> overwrite and restore the KERNEL_GS_BASE MSR may be quite high, so
>> any feedback in regards to the approach or suggestions of alternate
>> approaches to patching are welcome :)
>
> Since the kernel, as far as I understand, uses FineIBT without
> backwards control flow protection (in other words, I think we assume
> that the kernel stack is trusted?),
This is fun indeed. Linux cannot use supervisor shadow stacks because
the mess around NMI re-entrancy (and IST more generally) requires ROP
gadgets in order to function safely. Implementing this with shadow
stacks active, while not impossible, is deemed to be prohibitively
complicated.
Linux's supervisor shadow stack support is waiting for FRED support,
which fixes both the NMI re-entrancy problem, and other exceptions
nesting within NMIs, as well as prohibiting the use of the SWAPGS
instruction as FRED tries to make sure that the correct GS is always in
context.
But FRED support is slated for PantherLake/DiamondRapids, which haven't
shipped yet, so they are of no use for the problem right now.
> could we build a cheaper
> check on that basis somehow? For example, maybe we could do something like:
>
> ```
> endbr64
> test rsp, rsp
> js slowpath
> swapgs
> ```
I presume it's been pointed out already, but there are 3 related
entrypoints here: SYSCALL64, which is the one being discussed, plus
SYSCALL32 and SYSENTER.
But, any other IDT entry is in a similar bucket. If we're corrupting a
function pointer or return address to redirect here, then the check of
CS(%rsp) to control the conditional SWAPGS is an OoB read in the caller's
stack frame.
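
To illustrate the CS(%rsp) check, here is a minimal sketch in C. The struct
and function names are hypothetical and this is a model, not the kernel's
real entry code; the layout assumed is the frame hardware pushes for a
64-bit interrupt without an error code. The decision to SWAPGS comes from
the privilege bits of the saved CS, which is only meaningful if hardware
actually pushed that frame.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the frame pushed by hardware on a 64-bit
 * interrupt/exception delivery (no error code). Lowest address first.
 */
struct iret_frame {
    uint64_t rip;
    uint64_t cs;      /* low two bits hold the interrupted privilege level */
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/*
 * Models the "conditional SWAPGS" decision: swap only when the saved CS
 * says the interrupted context was not ring 0.
 */
static bool entry_would_swapgs(const struct iret_frame *frame)
{
    return (frame->cs & 3) != 0;
}
```

With Linux's usual selectors, a saved CS of 0x33 (user) yields true and 0x10
(kernel) yields false. If the entrypoint is instead reached through a
corrupted function pointer or return address, no frame was pushed, so
`frame->cs` is simply whatever the caller's stack held at that offset.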
For IDT entries, checking %rsp is reasonable, because userspace can't
forge a kernel-like %rsp. However, SYSCALL64 specifically leaves %rsp
entirely attacker controlled (and even potentially non-canonical), so
I'm wondering what you had in mind for the slowpath to truly
distinguish kernel context from user context?
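
A small sketch in C of why the sign test alone cannot settle this on the
SYSCALL64 path. The function names are hypothetical, and the canonical
check assumes 4-level paging (48-bit virtual addresses). `test rsp, rsp;
js` looks only at bit 63, and userspace can load any 64-bit value into
%rsp before executing SYSCALL.

```c
#include <stdbool.h>
#include <stdint.h>

/* What `test rsp, rsp; js slowpath` branches on: bit 63 only. */
static bool rsp_sign_bit_set(uint64_t rsp)
{
    return (rsp >> 63) != 0;
}

/*
 * Canonical under the assumed 48-bit virtual address width:
 * bits 63:47 must be all zeros or all ones.
 */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* the 17 bits 63:47 */
    return top == 0 || top == 0x1ffff;
}
```

For example, 0x00007ffffffff000 has the sign bit clear and is canonical,
0xffffc90000000000 has it set and is canonical, and 0x8000000000000000 has
it set but is non-canonical. Userspace can place any of the three in %rsp
before SYSCALL, so a set sign bit by itself says nothing about where
control came from.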
~Andrew