linux-kernel - Re: [RFC] Circumventing FineIBT Via Entrypoints


Message-ID: <8b82b394-3f54-437b-bd3a-7ac0eabda687@citrix.com>
Date: Tue, 25 Feb 2025 21:14:01 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Rudolf Marek <r.marek@...embler.cz>, Jann Horn <jannh@...gle.com>
Cc: jmill@....edu, joao@...rdrivepizza.com, luto@...nel.org,
 samitolvanen@...gle.com, "Peter Zijlstra (Intel)" <peterz@...radead.org>,
 linux-hardening@...r.kernel.org, lkml <linux-kernel@...r.kernel.org>,
 x86 maintainers <x86@...nel.org>
Subject: Re: [RFC] Circumventing FineIBT Via Entrypoints

On 25/02/2025 8:06 pm, Rudolf Marek wrote:
> Hi Andrew,
>
> Dne 25. 02. 25 v 19:10 Andrew Cooper napsal(a):
>> Very cunning.  Yes it does, but the state needs to be safe to IRET back
>> to, and ...
>
> ... And intellectually very pleasing!
>
>>> Would it work to have KERNEL_CS as last entry in the GDT table?
>>> Therefore executing SYSCALL would set the CS as usual,
>>> but the numeric value of SS selector would be larger than GDT limit?
>>
>> ... this isn't safe.  Any exception/interrupt will yield #SS when trying
>> to load an out-of-limit %ss.  i.e. a wrongly-timed NMI will take out
>> the system with a very bizarre looking oops.
>
> Hmm I was hoping that "the reader" will perform this NMI/#MC exercise :)

As a stand-in for "the reader", I'll point out that you need to add #DB to
that list or you're in for a rude surprise when running the x86 selftests.

>
> The SYSCALL/SYSENTER startup has interrupts disabled, so it is the
> problem of the NMI/#MC handler, which would need to deal with the
> normal case and the attack case.

Right, but in the case of the attack, regular interrupts are most likely
enabled too.  And writing this has just caused me to realise a
yet-more-fun case.

An interrupt hitting the syscall entry path (prior to SWAPGS) will cause
the interrupt handler's CPL check and conditional SWAPGS to do the wrong
thing and switch onto the user GS base too.  (Prior research e.g.
GhostRace has shown how to get an hrtimer to reliably hit an instruction
boundary.)

i.e. you'd need paranoid_entry on every vector, not just the IST ones.

>
> It would need to check if it was executing that critical part of
> syscall64 entry from endbr64 to checkselector section, and if yes, the
> saved %ss needs to be "impossible" one. If it isn't -> panic.
>
> For non-attack case it just needs to forward RIP after the check...
>
>> You can do this in a less fatal way by e.g. having in-GDT form have a
>> segment limit, but any exception/interrupt will resync the out-of-sync
>> state, and break detection.  Also it would make the segment unusable for
>> compatibility userspace, where the limit would take effect.
>
> Yeah couldn't figure out what else could work "vice-versa" :(
>  
>> Finally, while this potentially gives us an option for SYSCALL and maybe
>> SYSENTER, it doesn't help with any of the main IDT entrypoints which can
>> also be attacked.
>
> I see, sorry I wasn't aware of this. But if I recall correctly only
> "paranoid" IDT entries do something with swapgs. But is there also some
> stack pivot where it would depend on GS? Or is it somewhat unrelated
> issue, that you might just redirect to "any endbr64" which are IDT
> entrypoints?
>
> Maybe you can share some details of how the attack would work in this
> case, or point me somewhere where I can read about it.
>
> If it is "any endbr64" case, would it work to just do "sanity check"
> of the exception stackframe?

The problem is type confusion.  Because ENDBR marks both the regular
function callees, and the system entrypoints (256*IDT + 2*SYSCALL +
SYSENTER), a function pointer corrupted to refer to a system entrypoint
will pass the CET-IBT check and not yield #CP.
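
As a rough illustration of that type confusion, here is a toy C model.
All names in it (branch_target, ibt_check_passes, is_legitimate_callee)
are hypothetical and exist only for this sketch; it models the check as
"does the target start with ENDBR?" and nothing else, which is the
property described above.

```c
#include <stdbool.h>

/* Toy model (hypothetical names): what an indirect branch can land on. */
enum target_kind {
    TARGET_FUNCTION,     /* ordinary function, a legitimate callee */
    TARGET_ENTRYPOINT,   /* IDT / SYSCALL / SYSENTER entry stub */
    TARGET_MID_FUNCTION, /* some instruction in the middle of a function */
};

struct branch_target {
    enum target_kind kind;
    bool starts_with_endbr;
};

/* Entrypoint count as given in the mail: 256*IDT + 2*SYSCALL + SYSENTER. */
enum { NR_SYSTEM_ENTRYPOINTS = 256 + 2 + 1 };

/* The coarse check only asks whether the target begins with ENDBR. */
bool ibt_check_passes(const struct branch_target *t)
{
    return t->starts_with_endbr;
}

/* What the caller actually meant: the target is a real function. */
bool is_legitimate_callee(const struct branch_target *t)
{
    return t->kind == TARGET_FUNCTION;
}
```

In this model an entrypoint satisfies ibt_check_passes() while failing
is_legitimate_callee(), which is the gap a corrupted function pointer
can use.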

All entrypoints then conditionally (IDT) or unconditionally
(SYSCALL/SYSENTER) SWAPGS.  For the attack case, this switches back onto
the user GS base.
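
A minimal sketch of why the unconditional SWAPGS goes wrong, assuming a
toy model in which the CPU state is just the two GS bases.  The struct
and function names are hypothetical; the only architectural fact relied
on is that SWAPGS exchanges the active GS base with the value held in
the IA32_KERNEL_GS_BASE MSR.

```c
#include <stdint.h>

/* Toy model (hypothetical names) of the two GS bases a CPU holds. */
struct cpu_gs {
    uint64_t gs_base;        /* active base, used by %gs-relative accesses */
    uint64_t kernel_gs_base; /* inactive base (IA32_KERNEL_GS_BASE MSR) */
};

/* SWAPGS exchanges the active and inactive bases; it has no notion of
 * which one is "the kernel's". */
void swapgs(struct cpu_gs *cpu)
{
    uint64_t tmp = cpu->gs_base;
    cpu->gs_base = cpu->kernel_gs_base;
    cpu->kernel_gs_base = tmp;
}

/* A SYSCALL-style entry stub swaps unconditionally, because it assumes
 * it was reached from userspace with the user base still active. */
void syscall_style_entry(struct cpu_gs *cpu)
{
    swapgs(cpu);
}
```

Reached from userspace, the stub ends up on the kernel base.  Reached
via a corrupted function pointer, the kernel base is already active, so
the same swap lands on the user-controlled base instead.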

Interrupts and exceptions look at %cs in the IRET frame to judge whether
to SWAPGS or not (and this is one of the main things that paranoid_entry
does differently).  In the case of the attack, there's no IRET frame
pushed on the stack, so the read of %cs is out-of-bounds and most likely
lands in the stack frame of the function which followed the corrupt
function pointer.
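
The conditional case can be sketched the same way.  This is a simplified
model, not the kernel's entry code: it ignores error codes and assumes
the stub simply tests the low two bits (the privilege level) of whatever
sits in the %cs slot of the 64-bit IRET frame.  The function name and
the slot enum are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slots of the hardware-pushed 64-bit IRET frame, lowest address first. */
enum { FRAME_RIP, FRAME_CS, FRAME_RFLAGS, FRAME_RSP, FRAME_SS };

/* Simplified entry decision (hypothetical helper): SWAPGS only if the
 * saved %cs says the interrupted context was not CPL0.  The stub has no
 * way to tell whether stack_top really points at an IRET frame. */
bool entry_wants_swapgs(const uint64_t *stack_top)
{
    return (stack_top[FRAME_CS] & 3) != 0;
}
```

With a genuine frame the answer follows the interrupted privilege level.
When the stub is reached by an indirect call, the "frame" is just
whatever the calling function left on its stack, so the answer follows
that data instead.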

The SYSCALL entrypoint is simply the easiest to pivot on, but all can be
attacked in this manner.  Fixing only the SYSCALL entrypoint doesn't
improve things much.

Peter Zijlstra has added a FineIBT=paranoid mode which performs the hash
check ahead of calling the function pointer, which ought to mitigate
this but at even higher overhead.
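
A loose sketch of the idea of checking the hash ahead of the call, in
plain C.  This is not how the kernel implements it; typed_target,
checked_call and the hash constant are hypothetical, and the hash is
stored in a struct here purely so the example is self-contained.  The
point it shows is only the ordering: the comparison happens before
control is transferred, so a mismatching target is never entered.

```c
#include <stdbool.h>
#include <stdint.h>

typedef long (*handler_fn)(long);

/* Hypothetical stand-in for "a function plus its type hash". */
struct typed_target {
    uint32_t type_hash;
    handler_fn fn;
};

enum { HANDLER_HASH = 0x5ca1ab1e }; /* made-up value for the sketch */

bool entrypoint_reached = false;

long double_it(long x) { return 2 * x; }

/* Plays the role of a system entrypoint that must never be called. */
long fake_entrypoint(long x) { entrypoint_reached = true; return x; }

/* Caller-side check: compare hashes first, branch only on a match. */
bool checked_call(const struct typed_target *t, uint32_t expected_hash,
                  long arg, long *result)
{
    if (t->type_hash != expected_hash)
        return false; /* refused before any control transfer */
    *result = t->fn(arg);
    return true;
}
```

The extra compare on every indirect call is where the additional
overhead in this sketch comes from.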

~Andrew