Message-ID: <edf50534-3c31-492d-b975-824bf19cf36d@citrix.com>
Date: Sun, 2 Mar 2025 22:31:35 +0000
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Rudolf Marek <r.marek@...embler.cz>, Jann Horn <jannh@...gle.com>
Cc: jmill@....edu, joao@...rdrivepizza.com, luto@...nel.org,
 samitolvanen@...gle.com, "Peter Zijlstra (Intel)" <peterz@...radead.org>,
 linux-hardening@...r.kernel.org, lkml <linux-kernel@...r.kernel.org>,
 x86 maintainers <x86@...nel.org>
Subject: Re: [RFC] Circumventing FineIBT Via Entrypoints

On 02/03/2025 7:16 pm, Rudolf Marek wrote:
> On 01. 03. 25 at 23:48, Rudolf Marek wrote:
>> I don't know how slow it is to do the jump back via a far jump.
>
> I ran a microbenchmark on a Raptor Lake platform, using another
> operating system that I'm very familiar with.
>
> I added the following sequence to the SYSCALL64 entry point:
>
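>     .balign 16
> syscallentry64:
>     .byte 0x48                  # REX.W prefix, emitted by hand
>     ljmp *jmpaddr(%rip)         # far jump through the pointer at jmpaddr
> continuehere:
>     swapgs
> <...>
>
> jmpaddr:
>     .quad continuehere          # 64-bit offset
>     .word KERN_OTHER_CS << 3    # code segment selector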
>
> And well, it is about 1.5x slower. The unmodified syscall benchmark
> took on average 261 cycles / 104 ns, and the one with the indirect
> jump and %cs change took 386 cycles / 154 ns.
>
> This whole thing is quite literally a trap next to a trap, because GAS
> wasn't adding the REX.W prefix and, for some reason, complained about
> ljmpq.

(I've not finished replying to your other email, but here's one bit
brought forward)

Sadly, far jumps and calls are where Intel and AMD CPUs disagree on how
to decode the instruction stream.  Intel CPUs obey the REX.W prefix for
operand size, while AMD CPUs do not.  That is, the offset on AMD stays at
most 32 bits wide, so AMD CPUs cannot far jump or far call to kernel
addresses at all.
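
As a rough illustration of that decode difference, here is a toy Python
model (not real CPU behaviour in every detail) of how the in-memory far
pointer from the quoted snippet would be read when REX.W is honoured and
when it is ignored.  The target address and selector value are made up,
and the function names are hypothetical.

```python
import struct

def far_pointer_bytes(offset, selector):
    """Lay out the operand like the quoted snippet: .quad offset, .word selector."""
    return struct.pack("<QH", offset, selector)

def decode_m16_64(mem):
    """REX.W honoured: read a 64-bit offset followed by a 16-bit selector."""
    offset, selector = struct.unpack_from("<QH", mem)
    return selector, offset

def decode_m16_32(mem):
    """REX.W ignored: read a 32-bit offset followed by a 16-bit selector."""
    offset, selector = struct.unpack_from("<IH", mem)
    return selector, offset

# Assumed example values: a typical-looking upper-half kernel text address
# and an arbitrary selector.  Neither comes from the original thread.
target = 0xFFFFFFFF81000010
selector = 0x10
mem = far_pointer_bytes(target, selector)

for name, decode in (("REX.W honoured", decode_m16_64),
                     ("REX.W ignored", decode_m16_32)):
    sel, off = decode(mem)
    print(f"{name}: selector={sel:#06x} offset={off:#018x}")
```

In this model the 32-bit decode yields offset 0x81000010 and takes its
selector from bits 32-47 of the intended address (0xffff here), so the
jump cannot land on the intended upper-half address.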

This is why you generally only see far returns, which do behave the same
on both vendors but require a stack.

~Andrew
