[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202509292310.BB6055DC@keescook>
Date: Mon, 29 Sep 2025 23:18:46 -0700
From: Kees Cook <kees@...nel.org>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, Andrew Pinski <pinskia@...il.com>,
Jakub Jelinek <jakub@...hat.com>, Martin Uecker <uecker@...raz.at>,
Richard Biener <rguenther@...e.de>,
Joseph Myers <josmyers@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Jeff Law <jeffreyalaw@...il.com>, Jan Hubicka <hubicka@....cz>,
Richard Earnshaw <richard.earnshaw@....com>,
Richard Sandiford <richard.sandiford@....com>,
Marcus Shawcroft <marcus.shawcroft@....com>,
Kyrylo Tkachov <kyrylo.tkachov@....com>,
Kito Cheng <kito.cheng@...il.com>,
Palmer Dabbelt <palmer@...belt.com>,
Andrew Waterman <andrew@...ive.com>,
Jim Wilson <jim.wilson.gcc@...il.com>,
Dan Li <ashimida.1990@...il.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Ramon de C Valle <rcvalle@...gle.com>,
Joao Moreira <joao@...rdrivepizza.com>,
Nathan Chancellor <nathan@...nel.org>,
Bill Wendling <morbo@...gle.com>, gcc-patches@....gnu.org,
linux-hardening@...r.kernel.org
Subject: Re: [PATCH v4 6/7] arm: Add ARM 32-bit Kernel Control Flow Integrity
implementation
On Mon, Sep 29, 2025 at 11:59:15AM +0200, Ard Biesheuvel wrote:
> On Fri, 26 Sept 2025 at 05:02, Kees Cook <kees@...nel.org> wrote:
> >
> > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> > index 422ae549b65b..c3b9f16ea872 100644
> > --- a/gcc/config/arm/arm.md
> > +++ b/gcc/config/arm/arm.md
> ...
> > +/* Output the assembly for a KCFI checked call instruction. INSN is the
> > + RTL instruction being processed. OPERANDS is the array of RTL operands
> > + where operands[0] is the call target register, operands[2] is the KCFI
> > + type ID constant. Returns an empty string as all output is handled by
> > + direct assembly generation. */
> > +
> > +const char *
> > +arm_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> > +{
> > + /* KCFI type id. */
> > + uint32_t type_id = INTVAL (operands[2]);
> > +
> > + /* Calculate typeid offset from call target. */
> > + HOST_WIDE_INT offset = -kcfi_typeid_offset;
> > +
> > + /* Generate custom label names. */
> > + char trap_name[32];
> > + char call_name[32];
> > + ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", kcfi_labelno);
> > + ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", kcfi_labelno);
> > +
> > + /* Create memory operand for the type load. */
> > + rtx mem_op = gen_rtx_MEM (SImode,
> > + gen_rtx_PLUS (SImode, operands[0],
> > + GEN_INT (offset)));
> > + rtx temp_operands[6];
> > +
> > + /* Normally we can use r12 as our scratch register. */
> > + unsigned scratch_reg_num = IP_REGNUM;
> > + /* If register pressure has made r12 our target register, we need to pick
> > + a different register. We don't want to spill our target register
> > + because on reload at the end of the KCFI check, we'd be producing
> > + the very kind of call gadget we were trying to protect against:
> > + "pop %target; call %target". In this case, use r3 as our scratch
> > + register. But since r3 may be used for function arguments, we need
> > + to check if it is being used for that and only spill/reload if that
> > + happens. Any spill/reload of r3 due to making a call will already
> > + have been managed by the register allocator, so we only have to care
> > + about not clobbering the argument value it may be carrying into the
> > + call here. Also use r3 when r12 is a fixed register. */
> > + if (REGNO (operands[0]) == scratch_reg_num
> > + || fixed_regs[scratch_reg_num])
> > + scratch_reg_num = LAST_ARG_REGNUM;
> > + rtx scratch_reg = gen_rtx_REG (SImode, scratch_reg_num);
> > +
> > + /* We only need to spill r3 if it's actually used by the call. */
> > + bool need_spill = (scratch_reg_num == LAST_ARG_REGNUM)
> > + && reg_overlap_mentioned_p (scratch_reg, insn);
> > +
> > + /* Calculate trap immediate. */
> > + unsigned addr_reg_num = REGNO (operands[0]);
> > + /* The scratch register is always clobbered by eor seq: use 0x1F. */
> > + unsigned udf_immediate = 0x8000 | (0x1F << 5) | (addr_reg_num & 31);
> > +
>
> I take it this means you still need to decode the instructions in the
> kernel to obtain the expected type id?
Currently, yes.
> Can't you insert the actual register index here, and defer the reload
> until after the UDF? That way, the scratch register will always
> contain the XOR of the actual vs expected typeids when taking the
> trap.
My instinct is to avoid any kind of load/call gadget (as a ROP target),
even if the controlled register is only the 4th argument. The risk is
much lower, but it seemed like reducing the risk to 0 requires just a
little help on the kernel side after taking the trap (and x86 already
does this reliably).
I suppose as an alternative I could use the index when it's not r3, but
then Linux would need to read the destination memory to rebulid the XOR?
I think that's even more fragile... I think it'd be best to just read
back the prior 5 instructions before the trap. It's reliable. :)
-Kees
--
Kees Cook
Powered by blists - more mailing lists