Message-ID: <202509081452.5AD50CAA@keescook>
Date: Mon, 8 Sep 2025 14:55:38 -0700
From: Kees Cook <kees@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, Andrew Pinski <pinskia@...il.com>,
Richard Biener <rguenther@...e.de>,
Joseph Myers <josmyers@...hat.com>, Jan Hubicka <hubicka@....cz>,
Richard Earnshaw <richard.earnshaw@....com>,
Richard Sandiford <richard.sandiford@....com>,
Marcus Shawcroft <marcus.shawcroft@....com>,
Kyrylo Tkachov <kyrylo.tkachov@....com>,
Kito Cheng <kito.cheng@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andrew Waterman <andrew@...ive.com>,
Jim Wilson <jim.wilson.gcc@...il.com>,
Dan Li <ashimida.1990@...il.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Ramon de C Valle <rcvalle@...gle.com>,
Joao Moreira <joao@...rdrivepizza.com>,
Nathan Chancellor <nathan@...nel.org>,
Bill Wendling <morbo@...gle.com>, gcc-patches@....gnu.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity
infrastructure
On Mon, Sep 08, 2025 at 05:32:58PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 05, 2025 at 09:19:29AM -0700, Kees Cook wrote:
> > On Fri, Sep 05, 2025 at 10:51:03AM +0200, Peter Zijlstra wrote:
> > > On Thu, Sep 04, 2025 at 05:24:10PM -0700, Kees Cook wrote:
> > > > +- The check-call instruction sequence must be treated as a single unit: it
> > > > + cannot be rearranged or split or optimized. The pattern is that
> > > > + indirect calls, "call *$target", get converted into:
> > > > +
> > > > + mov $target_expression, %target ; only present if the expression was
> > > > + ; not already %target register
> > > > + load -$offset(%target), %tmp ; load the typeid hash at target
> > > > + cmp $hash, %tmp ; compare expected typeid with loaded
> > > > + je .Lcheck_passed ; jump to the indirect call
> > > > + .Lkcfi_trap$N: ; label of trap insn
> > > > + trap ; trap on failure, but arranged so
> > > > + ; "permissive mode" falls through
> > > > + .Lkcfi_call$N: ; label of call insn
> > > > + call *%target ; actual indirect call
> > > > +
> > > > + This pattern of call immediately after trap provides for the
> > > > + "permissive" checking mode automatically: the trap gets handled,
> > > > + a warning emitted, and then execution continues after the trap to
> > > > + the call.
> > >
> > > I know it is far too late to do anything here. But I've recently dug
> > > through a bunch of optimization manuals and the like, and that Jcc is
> > > about as bad as it gets :/
> > >
> > > The old optimization manual states that forward jumps are assumed
> > > not-taken; while backward jumps are assumed taken.
> > >
> > > The new wisdom is that any Jcc must be assumed not-taken; that is, the
> > > fallthrough case has the best throughput.
> >
> > I would expect the cmp to be the slowest part of this sequence, and I
> > figured both the trap and the call to be speculation barriers? I'm
> > not sure, though. Is changing the sequence actually useful?
>
> The load can miss, in which case it is definitely the most expensive
> thing around.
>
> > > Here we have a forward branch which is assumed taken :-(
> >
> > The constraints we have are:
> >
> > - Linux x86 KCFI trap handler decodes the instructions from the trap
> > backwards, but it uses exact offsets (-12 and -6).
> > - Control flow following the trap must make the call (for warn-only mode)
> >
> > If we change this, we'd need to make the insn decoder smarter to likely
> > look at the insn AFTER the trap ("is it a direct jump?")
> >
> > And then use this, which is ugly, but matches second constraint:
> >
> > cmp $hash, %tmp
> > jne .Ltrap
> > .Lcall:
> > call *%target
> > jmp .Ldone
> > .Ltrap:
> > ud2
> > jmp .Lcall
> > .Ldone:
>
> Ah, you can do something like:
>
> cmp $hash, %tmp
> jne +3
> nopl -42(%rax)
> call *%target
>
> which is only 2 bytes longer. Notably, that nopl is 4 bytes and the 4th
> byte is 0xd6 (aka UDB). This is an effective UDcc instruction based
> around a forward non-taken branch.

Oh right, I forgot about the nop encodings.

> But yeah, I don't know if it is worth changing this. It's just that I've
> been staring at these things far too much of late :-)

To do this we'd need to change the Linux trap handler and Clang's
implementation, so yeah, I'm inclined to just leave it as-is until we
have a stronger reason to change it.
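
As an illustration only (not code from the patch, Clang, or the kernel),
the byte arithmetic behind the nopl trick quoted above can be checked with
a small C sketch. It assumes the usual x86-64 encodings: 75 03 for
"jne +3", 0f 1f 40 d6 for "nopl -42(%rax)", and a 2-byte je followed by a
2-byte ud2 for the current sequence. All helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings, for illustration only. */
static const uint8_t jne_rel8[] = { 0x75, 0x03 };             /* jne +3 */
static const uint8_t nopl[]     = { 0x0f, 0x1f, 0x40, 0xd6 }; /* nopl -42(%rax) */

/* Two's-complement byte used to encode a signed 8-bit displacement. */
static uint8_t disp8_of(int value)
{
	return (uint8_t)(int8_t)value;
}

/*
 * Offset into nopl[] where a taken jne lands. A rel8 branch is relative
 * to the end of the branch instruction, which is where nopl[] starts.
 */
static size_t jne_landing_offset(void)
{
	return (size_t)(int8_t)jne_rel8[1];
}

/* The byte that a taken jne starts decoding at. */
static uint8_t landing_byte(void)
{
	size_t off = jne_landing_offset();

	assert(off < sizeof(nopl));
	return nopl[off];
}

/* Size difference versus the assumed je (2 bytes) + ud2 (2 bytes) pair. */
static size_t extra_bytes(void)
{
	return (sizeof(jne_rel8) + sizeof(nopl)) - (2 + 2);
}
```

With those assumptions, the taken branch lands on the last byte of the
nopl, that byte is the 0xd6 displacement, and the pair is 2 bytes longer
than je + ud2. The instruction that follows the 0xd6 byte is the call,
so the warn-only fallthrough constraint still holds.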
--
Kees Cook