Message-ID: <202509120157.9681B7FC6D@keescook>
Date: Fri, 12 Sep 2025 02:03:08 -0700
From: Kees Cook <kees@...nel.org>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, Andrew Pinski <pinskia@...il.com>,
Richard Biener <rguenther@...e.de>,
Joseph Myers <josmyers@...hat.com>, Jan Hubicka <hubicka@....cz>,
Richard Earnshaw <richard.earnshaw@....com>,
Richard Sandiford <richard.sandiford@....com>,
Marcus Shawcroft <marcus.shawcroft@....com>,
Kyrylo Tkachov <kyrylo.tkachov@....com>,
Kito Cheng <kito.cheng@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andrew Waterman <andrew@...ive.com>,
Jim Wilson <jim.wilson.gcc@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Dan Li <ashimida.1990@...il.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Ramon de C Valle <rcvalle@...gle.com>,
Joao Moreira <joao@...rdrivepizza.com>,
Nathan Chancellor <nathan@...nel.org>,
Bill Wendling <morbo@...gle.com>, gcc-patches@....gnu.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity
implementation
On Thu, Sep 11, 2025 at 09:49:56AM +0200, Ard Biesheuvel wrote:
> On Fri, 5 Sept 2025 at 02:24, Kees Cook <kees@...nel.org> wrote:
> >
> > Implement ARM 32-bit KCFI backend supporting ARMv7+:
> >
> > - Function preamble generation using .word directives for type ID storage
> > at -4 byte offset from function entry point (no prefix NOPs needed due to
> > 4-byte instruction alignment).
> >
> > - Use movw/movt instructions for 32-bit immediate loading.
> >
> > - Trap debugging through UDF instruction immediate encoding following
> > AArch64 BRK pattern for encoding registers with useful contents.
> >
> > - Scratch register allocation using r0/r1 following ARM procedure call
> > standard for caller-saved temporary registers, though they get
> > stack spilled due to register pressure.
> >
> > Assembly Code Pattern for ARM 32-bit:
> > push {r0, r1} ; Spill r0, r1
> > ldr r0, [target, #-4] ; Load actual type ID from preamble
> > movw r1, #type_id_low ; Load expected type (lower 16 bits)
> > movt r1, #type_id_high ; Load upper 16 bits with top instruction
> > cmp r0, r1 ; Compare type IDs directly
> > pop {r0, r1} ; Reload r0, r1
>
> We could avoid the MOVW/MOVT pair and the spilling by doing something
> along the lines of
>
> ldr ip, [target, #-4]
> eor ip, ip, #type_id[0]
> eor ip, ip, #type_id[1] << 8
> eor ip, ip, #type_id[2] << 16
> eors ip, ip, #type_id[3] << 24
> ldrne ip, =type_id[3:0]

Ah-ha, nice. And it could re-load the type_id on the slow path instead
of unconditionally, I guess? (So no "ne" suffix needed there.)

    ...
    eors ip, ip, #type_id[3] << 24
    beq .Lkcfi_call
.Lkcfi_trap:
    ldr ip, =type_id[3:0]
    udf #nnn
.Lkcfi_call:
    blx target

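
As a sanity check on the arithmetic, the EOR chain can be modeled in a few
lines of Python. This is only a sketch, not code from the patch: the function
names are hypothetical, and it assumes type_id[n] means byte n of the expected
32-bit type ID (least significant byte first). The register ends up zero, so
the final EORS would set the Z flag, exactly when the loaded and expected IDs
match:

```python
MASK32 = 0xFFFFFFFF


def eor_chain(loaded: int, expected: int) -> int:
    """Model the four EOR/EORS steps on a 32-bit register (hypothetical helper)."""
    ip = loaded & MASK32
    for shift in (0, 8, 16, 24):
        # One byte of the expected ID, already shifted into position,
        # mirroring "#type_id[n] << (8 * n)" in the assembly.
        byte_imm = expected & (0xFF << shift)
        ip ^= byte_imm
    return ip


def takes_fast_path(loaded: int, expected: int) -> bool:
    """True when the final EORS would leave Z set, i.e. 'beq .Lkcfi_call' is taken."""
    return eor_chain(loaded, expected) == 0
```

Since XOR-ing in the four bytes one at a time is the same as XOR-ing the full
32-bit value, any mismatch leaves a nonzero result and falls through to the
trap path.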
>
> Note that IP (R12) should be dead before a function call. Here it is
> conditionally loaded with the expected target typeid, removing the
> need to decode the instructions to recover it when the trap occurs.
>
> This should compile to Thumb2 as well as ARM encodings.
Won't IP get used as the target register if r0-r3 are used for passing
arguments? AAPCS implies this is how it'll go (4 arguments in registers,
the rest on stack), but when I tried to force this to happen, it looked
like it'd only pass 3 via registers, and would make the call with r3.
I can't tell whether it is safe to use IP unconditionally.
--
Kees Cook