Message-ID: <CAKUFkX0+zrvvajVgOiHb=hqaJA+7_WTFetweNZ2hzNHTKwhQbg@mail.gmail.com>
Date: Tue, 18 Feb 2025 11:18:34 -0800
From: Joao Moreira <joao@...rdrivepizza.com>
To: Jennifer Miller <jmill@....edu>
Cc: linux-hardening@...r.kernel.org, kees@...nel.org, samitolvanen@...gle.com
Subject: Re: [RFC] Circumventing FineIBT Via Entrypoints
Hey Jennifer, really cool work :)
Since you mentioned approaches to handling the issue, one possible
simple fix would be to align function endbrs to 0x1 while keeping the
entry points at 0x0. Then, before doing the indirect call, just OR 0x1
into the function address. It seems PeterZ has something else in his
pipeline, which sounds very reasonable for upstream and addresses other
concerns raised by Kees. Still, since you may be interested in
approaches beyond what is suitable for upstream, I thought this trick
might be worth sharing.
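To make the trick a bit more concrete, here is a rough userspace model in C.
It is only a sketch, not kernel code: it assumes every ordinary function has
its endbr64 at an odd address while the syscall entry point keeps its endbr64
at an even one. The addresses and the names endbr_sites, landing_pad_ok and
harden_target are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical addresses at which an endbr64 instruction is present. */
static const uintptr_t endbr_sites[] = {
    0x1001, /* ordinary function: endbr64 placed at an odd address */
    0x2001, /* another ordinary function */
    0x3000, /* syscall entry point: endbr64 left at an even address */
};

/* Models the IBT rule: an indirect branch must land on an endbr64. */
static bool landing_pad_ok(uintptr_t target)
{
    for (size_t i = 0; i < sizeof(endbr_sites) / sizeof(endbr_sites[0]); i++)
        if (endbr_sites[i] == target)
            return true;
    return false;
}

/* What every indirect call site would do just before branching. */
static uintptr_t harden_target(uintptr_t fn)
{
    return fn | (uintptr_t)0x1;
}

void demo_or_trick(void)
{
    /* Ordinary functions stay reachable, whichever form the pointer has. */
    assert(landing_pad_ok(harden_target(0x1000)));
    assert(landing_pad_ok(harden_target(0x1001)));

    /* A hijacked pointer to the entry point is nudged off its endbr64,
     * so in this model the branch would fault instead of landing there. */
    assert(!landing_pad_ok(harden_target(0x3000)));
}
```

The point of the model is that the OR costs a single instruction per call
site, and the entry point stays valid for the hardware (which jumps to it
without the OR) but not for software indirect calls.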
Thank you for the paper, I'm looking forward to reading it.
Best,
Joao
On Wed, Feb 12, 2025 at 1:08 PM Jennifer Miller <jmill@....edu> wrote:
>
> Hi All,
>
> As part of a recently accepted paper we demonstrated that syscall
> entrypoints can be misused on x86-64 systems to generically bypass
> FineIBT/KERNEL_IBT from forwards-edge control flow hijacking. We
> communicated this finding to s@k.o before submitting the paper and were
> encouraged to bring the issue to hardening after the paper was accepted to
> have a discussion on how to address the issue.
>
> The bypass takes advantage of the architectural requirement of entrypoints
> to begin with the endbr64 instruction and the ability to control GS_BASE
> from userspace via wrgsbase, due to the FSGSBASE extension, in order to
> perform a stack pivot to a ROP-chain.
>
> Here is a snippet of the 64-bit entrypoint code:
> ```
> entry_SYSCALL_64:
> <+0>: endbr64
> <+4>: swapgs
> <+7>: mov QWORD PTR gs:0x6014,rsp
> <+16>: jmp <entry_SYSCALL_64+36>
> <+18>: mov rsp,cr3
> <+21>: nop
> <+26>: and rsp,0xffffffffffffe7ff
> <+33>: mov cr3,rsp
> <+36>: mov rsp,QWORD PTR gs:0x32c98
> ```
>
> This is a valid target from any indirect callsite under FineIBT due to the
> endbr64 instruction and the lack of a software CFI check. After hijacking
> control flow to the entrypoint, executing swapgs will swap to the user
> controlled GS_BASE, which will be used to set the stack pointer, leading to
> a stack pivot. The rest of the entrypoint will execute with a hijacked
> GS_BASE on a user controlled stack. The stack page we use is one mapped in
> the user address space, and from another thread we race to overwrite return
> addresses on the stack to pivot a second time to a ROP-chain. For this to
> succeed we required a large area of user-controlled kernel memory that can
> serve as the forged GS_BASE address. We did this by spraying 2MB
> Transparent Huge Pages to fill the kernel physical memory map with
> controlled 2MB allocations and guessing relative to the base address of the
> area to hit a page we control.
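The register confusion described above can be pictured with a toy C model of
the two GS base values. This is only an illustration of the swap semantics,
with no real MSR access; struct cpu, the field names and both addresses are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the two GS base values; not real register or MSR access. */
struct cpu {
    uint64_t gs_base;        /* active base used for gs: memory accesses */
    uint64_t kernel_gs_base; /* inactive value held in KERNEL_GS_BASE */
};

/* swapgs exchanges the active and inactive values. */
static void swapgs(struct cpu *c)
{
    uint64_t tmp = c->gs_base;
    c->gs_base = c->kernel_gs_base;
    c->kernel_gs_base = tmp;
}

void demo_extra_swapgs(void)
{
    /* Hypothetical values: a forged base chosen from userspace and the
     * kernel's real per-CPU base. */
    const uint64_t forged_gs = UINT64_C(0xffff888012340000);
    const uint64_t percpu    = UINT64_C(0xffff888000100000);
    struct cpu c = { .gs_base = forged_gs, .kernel_gs_base = percpu };

    swapgs(&c); /* legitimate syscall entry: kernel base becomes active */
    assert(c.gs_base == percpu);

    swapgs(&c); /* hijacked indirect call runs the entrypoint a second time */
    assert(c.gs_base == forged_gs); /* later gs: loads use the forged base */
}
```

Since the entrypoint loads rsp from a gs-relative slot right after the swap,
whoever controls the active base at that moment controls the stack pointer.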
>
> We evaluated an approach to patching the issue in the paper, but it touched
> the userspace API a bit: it added an error code returned by syscalls if they
> are invoked with a kernel address in GS_BASE, which is not a great
> solution.
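For illustration only, a check of that kind could look roughly like the C
sketch below. This is not the paper's patch: it simply assumes that "kernel
address" means the upper half of the x86-64 address space (bit 63 set), and
the error value is a placeholder because the mail does not say which code
was used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value; the actual error code is not stated in the mail. */
#define ERR_GS_BASE_REJECTED (-1L)

/* Assumption: kernel addresses are those with the top bit set. */
static bool is_kernel_half(uint64_t addr)
{
    return (addr >> 63) != 0;
}

/* Hypothetical early check on syscall entry, given the user's GS_BASE. */
static long syscall_entry_check(uint64_t user_gs_base)
{
    if (is_kernel_half(user_gs_base))
        return ERR_GS_BASE_REJECTED; /* refuse to run the syscall */
    return 0;                        /* proceed as usual */
}

void demo_gs_check(void)
{
    assert(syscall_entry_check(UINT64_C(0x00007f0000001000)) == 0);
    assert(syscall_entry_check(UINT64_C(0xffff888012340000)) ==
           ERR_GS_BASE_REJECTED);
}
```

The API concern is visible even in the sketch: a program that legitimately
leaves an unusual value in GS_BASE would start seeing syscalls fail.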
>
> Linus provided some thoughts on how to potentially address this issue
> in our communication with s@k.o, suggesting the kernel could make the
> KERNEL_GS_BASE match the GS_BASE value so both registers always contain a
> valid kernel address and a confusion induced by executing swapgs an extra
> time cannot occur, and restore the value of KERNEL_GS_BASE ahead of
> executing swapgs in the exit path.
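A toy C model of that suggestion, under the same caveats as any userspace
sketch: struct cpu, saved_user_gs and the addresses are hypothetical, and
where the kernel would actually keep the user's value is an assumption. The
two plain assignments to kernel_gs_base stand in for the MSR writes on entry
and exit.

```c
#include <assert.h>
#include <stdint.h>

struct cpu {
    uint64_t gs_base;        /* active base used for gs: memory accesses */
    uint64_t kernel_gs_base; /* inactive value held in KERNEL_GS_BASE */
    uint64_t saved_user_gs;  /* hypothetical slot for the user's GS_BASE */
};

static void swapgs(struct cpu *c)
{
    uint64_t tmp = c->gs_base;
    c->gs_base = c->kernel_gs_base;
    c->kernel_gs_base = tmp;
}

static void syscall_entry(struct cpu *c)
{
    swapgs(c);
    c->saved_user_gs = c->kernel_gs_base; /* remember the user's value */
    c->kernel_gs_base = c->gs_base;       /* both now hold the kernel base */
}

static void syscall_exit(struct cpu *c)
{
    c->kernel_gs_base = c->saved_user_gs; /* restore ahead of swapgs */
    swapgs(c);
}

void demo_mitigation(void)
{
    const uint64_t user_gs = UINT64_C(0xffff888012340000); /* forged */
    const uint64_t percpu  = UINT64_C(0xffff888000100000);
    struct cpu c = { .gs_base = user_gs, .kernel_gs_base = percpu };

    syscall_entry(&c);
    swapgs(&c);                  /* extra swapgs from a hijacked call */
    assert(c.gs_base == percpu); /* still a valid kernel base */

    syscall_exit(&c);
    assert(c.gs_base == user_gs && c.kernel_gs_base == percpu);
}
```

While in the kernel, an extra swapgs just exchanges two identical values, so
the forged base never becomes active; the price is the two extra writes per
syscall.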
>
> I started working on a patch based on the approach suggested by Linus but I
> haven't been able to get it passing the relevant x86 selftests yet. It
> turned out that it's more than the entrypoint code that needs to be
> modified for it to work: we need to correctly save and restore the user's
> GS_BASE across task switches and ensure it is updated correctly when set
> via arch_prctl and ptrace. Unfortunately, I lack familiarity with those
> parts of the kernel, and my understanding is that the paper will be made
> public in a couple weeks so I didn't want to delay too long on bringing the
> issue to this list.
>
> Assuming this is an issue you all feel is worth addressing, I will continue
> working on providing a patch. I'm concerned though that the overhead from
> adding a wrmsr on both syscall entry and exit to overwrite and restore the
> KERNEL_GS_BASE MSR may be quite high, so any feedback in regards to the
> approach or suggestions of alternate approaches to patching are welcome :)
>
> ~Jennifer