linux-hardening - Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure

Message-ID: <202509050907.DE9A5ED2@keescook>
Date: Fri, 5 Sep 2025 09:19:29 -0700
From: Kees Cook <kees@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, Andrew Pinski <pinskia@...il.com>,
	Richard Biener <rguenther@...e.de>,
	Joseph Myers <josmyers@...hat.com>, Jan Hubicka <hubicka@....cz>,
	Richard Earnshaw <richard.earnshaw@....com>,
	Richard Sandiford <richard.sandiford@....com>,
	Marcus Shawcroft <marcus.shawcroft@....com>,
	Kyrylo Tkachov <kyrylo.tkachov@....com>,
	Kito Cheng <kito.cheng@...il.com>,
	Palmer Dabbelt <palmer@...belt.com>,
	Andrew Waterman <andrew@...ive.com>,
	Jim Wilson <jim.wilson.gcc@...il.com>,
	Dan Li <ashimida.1990@...il.com>,
	Sami Tolvanen <samitolvanen@...gle.com>,
	Ramon de C Valle <rcvalle@...gle.com>,
	Joao Moreira <joao@...rdrivepizza.com>,
	Nathan Chancellor <nathan@...nel.org>,
	Bill Wendling <morbo@...gle.com>, gcc-patches@....gnu.org,
	linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity
 infrastructure

On Fri, Sep 05, 2025 at 10:51:03AM +0200, Peter Zijlstra wrote:
> On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> > +- The check-call instruction sequence must be treated as a single unit: it
> > +  cannot be rearranged or split or optimized. The pattern is that
> > +  indirect calls, "call *$target", get converted into:
> > +
> > +    mov $target_expression, %target ; only present if the expression was
> > +                                    ; not already %target register
> > +    load -$offset(%target), %tmp    ; load the typeid hash at target
> > +    cmp $hash, %tmp                 ; compare expected typeid with loaded
> > +    je .Lcheck_passed               ; jump to the indirect call
> > +  .Lkcfi_trap$N:                    ; label of trap insn
> > +    trap                            ; trap on failure, but arranged so
> > +                                    ; "permissive mode" falls through
> > +  .Lkcfi_call$N:                    ; label of call insn
> > +    call *%target                   ; actual indirect call
> > +
> > +  This pattern of call immediately after trap provides for the
> > +  "permissive" checking mode automatically: the trap gets handled,
> > +  a warning emitted, and then execution continues after the trap to
> > +  the call.
> 
> I know it is far too late to do anything here. But I've recently dug
> through a bunch of optimization manuals and the like and that Jcc is
> about as bad as it gets :/
> 
> The old optimization manual states that forward jumps are assumed
> not-taken; while backward jumps are assumed taken.
> 
> The new wisdom is that any Jcc must be assumed not-taken; that is, the
> fallthrough case has the best throughput.

I would expect the cmp to be the slowest part of this sequence, and I
figured both the trap and the call would be speculation barriers? I'm
not sure, though. Is changing the sequence actually useful?

> Here we have a forward branch which is assumed taken :-(

The constraints we have are:

- Linux x86 KCFI trap handler decodes the instructions from the trap
  backwards, but it uses exact offsets (-12 and -6).
- Control flow following the trap must make the call (for warn-only mode)
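
To illustrate the first constraint, here is a rough sketch of decoding at
fixed offsets back from the trap. It assumes the x86_64 sequence is a
6-byte "movl $-hash, %r10d", a 4-byte "addl -4(%reg), %r10d", a 2-byte je,
and then ud2, which is where the -12 and -6 come from. The byte encodings
and the name decode_kcfi_check are assumptions made for this sketch; it is
not the kernel's decoder.

```python
import struct

MOVL_R10D = b"\x41\xba"  # assumed encoding: movl $imm32, %r10d (6 bytes total)
UD2 = b"\x0f\x0b"


def decode_kcfi_check(code: bytes, trap_off: int):
    """Return (expected_hash, target_reg) for the ud2 at trap_off, else None."""
    if trap_off < 12 or code[trap_off:trap_off + 2] != UD2:
        return None
    mov = code[trap_off - 12:trap_off - 6]  # exact offset -12
    add = code[trap_off - 6:trap_off - 2]   # exact offset -6
    if mov[:2] != MOVL_R10D:
        return None
    # addl disp8(%reg), %r10d: REX prefix, opcode 0x03, ModRM, disp8
    rex, opcode, modrm, _disp8 = add
    if (rex & 0xF0) != 0x40 or opcode != 0x03:
        return None
    dest = ((modrm >> 3) & 7) | (((rex >> 2) & 1) << 3)
    base_low = modrm & 7
    # Require mod=01 (disp8), destination r10, and no SIB byte (rm != 4).
    if (modrm >> 6) != 0b01 or dest != 10 or base_low == 4:
        return None
    target_reg = base_low | ((rex & 1) << 3)
    # The immediate holds the negated hash, so negate it again (mod 2^32).
    imm = struct.unpack("<I", mov[2:6])[0]
    return (-imm) & 0xFFFFFFFF, target_reg


# Hypothetical check sequence for hash 0x12345678 with the target in %r11.
sample = (
    MOVL_R10D + struct.pack("<I", (-0x12345678) & 0xFFFFFFFF)
    + b"\x45\x03\x53\xfc"  # addl -4(%r11), %r10d
    + b"\x74\x02"          # je +2
    + UD2
    + b"\x41\xff\xd3"      # call *%r11
)
result = decode_kcfi_check(sample, 12)
```

Any change in instruction lengths between the hash load and the trap breaks
a decoder like this, which is why the layout cannot change freely.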

If we change this, we'd need to make the insn decoder smarter, likely to
look at the insn AFTER the trap ("is it a direct jump?").

And then use this, which is ugly, but matches the second constraint:

	cmp $hash, %tmp
	jne .Ltrap
.Lcall:
	call *%target
	jmp .Ldone
.Ltrap:
	ud2
	jmp .Lcall
.Ldone:

+4 bytes for x86_64
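
The +4 can be checked by counting, assuming every jump uses the 2-byte rel8
form, ud2 is 2 bytes, and the cmp and call are the same size in both
layouts. The sketch below also shows the "is it a direct jump?" test a
smarter decoder might apply to the insn after the trap. The function name
jmp_after_trap and the sample bytes are made up for illustration.

```python
import struct

# Sizes in bytes; rel8 forms are assumed for every jump in this sketch.
SIZE = {"jcc": 2, "jmp": 2, "ud2": 2}

current = ["jcc", "ud2"]                 # je .Lcheck_passed; ud2
proposed = ["jcc", "jmp", "ud2", "jmp"]  # jne .Ltrap; jmp .Ldone; ud2; jmp .Lcall


def seq_size(seq):
    """Total size of the branch/trap instructions in a layout."""
    return sum(SIZE[insn] for insn in seq)


delta = seq_size(proposed) - seq_size(current)  # 8 - 4 = 4


def jmp_after_trap(code: bytes, trap_off: int):
    """If a direct jmp follows the ud2 at trap_off, return its target offset."""
    nxt = trap_off + 2
    if code[nxt:nxt + 1] == b"\xeb" and nxt + 2 <= len(code):  # jmp rel8
        rel = struct.unpack("<b", code[nxt + 1:nxt + 2])[0]
        return nxt + 2 + rel
    if code[nxt:nxt + 1] == b"\xe9" and nxt + 5 <= len(code):  # jmp rel32
        rel = struct.unpack("<i", code[nxt + 1:nxt + 5])[0]
        return nxt + 5 + rel
    return None


# Hypothetical bytes for the proposed layout, starting at the jne:
layout = (
    b"\x75\x05"      # 0: jne .Ltrap   (-> 7)
    b"\x41\xff\xd3"  # 2: .Lcall: call *%r11
    b"\xeb\x04"      # 5: jmp .Ldone   (-> 11)
    b"\x0f\x0b"      # 7: .Ltrap: ud2
    b"\xeb\xf7"      # 9: jmp .Lcall   (-> 2)
)
call_off = jmp_after_trap(layout, 7)
```

Here call_off lands on the call insn, which is what a warn-only handler
would need to find in order to resume at the call.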

-- 
Kees Cook