linux-hardening - Re: [PATCH v4 6/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation


Message-ID: <202509292310.BB6055DC@keescook>
Date: Mon, 29 Sep 2025 23:18:46 -0700
From: Kees Cook <kees@...nel.org>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Qing Zhao <qing.zhao@...cle.com>, Andrew Pinski <pinskia@...il.com>,
	Jakub Jelinek <jakub@...hat.com>, Martin Uecker <uecker@...raz.at>,
	Richard Biener <rguenther@...e.de>,
	Joseph Myers <josmyers@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Jeff Law <jeffreyalaw@...il.com>, Jan Hubicka <hubicka@....cz>,
	Richard Earnshaw <richard.earnshaw@....com>,
	Richard Sandiford <richard.sandiford@....com>,
	Marcus Shawcroft <marcus.shawcroft@....com>,
	Kyrylo Tkachov <kyrylo.tkachov@....com>,
	Kito Cheng <kito.cheng@...il.com>,
	Palmer Dabbelt <palmer@...belt.com>,
	Andrew Waterman <andrew@...ive.com>,
	Jim Wilson <jim.wilson.gcc@...il.com>,
	Dan Li <ashimida.1990@...il.com>,
	Sami Tolvanen <samitolvanen@...gle.com>,
	Ramon de C Valle <rcvalle@...gle.com>,
	Joao Moreira <joao@...rdrivepizza.com>,
	Nathan Chancellor <nathan@...nel.org>,
	Bill Wendling <morbo@...gle.com>, gcc-patches@....gnu.org,
	linux-hardening@...r.kernel.org
Subject: Re: [PATCH v4 6/7] arm: Add ARM 32-bit Kernel Control Flow Integrity
 implementation

On Mon, Sep 29, 2025 at 11:59:15AM +0200, Ard Biesheuvel wrote:
> On Fri, 26 Sept 2025 at 05:02, Kees Cook <kees@...nel.org> wrote:
> >
> > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> > index 422ae549b65b..c3b9f16ea872 100644
> > --- a/gcc/config/arm/arm.md
> > +++ b/gcc/config/arm/arm.md
> ...
> > +/* Output the assembly for a KCFI checked call instruction.  INSN is the
> > +   RTL instruction being processed.  OPERANDS is the array of RTL operands
> > +   where operands[0] is the call target register, operands[2] is the KCFI
> > +   type ID constant.  Returns an empty string as all output is handled by
> > +   direct assembly generation.  */
> > +
> > +const char *
> > +arm_output_kcfi_insn (rtx_insn *insn, rtx *operands)
> > +{
> > +  /* KCFI type id.  */
> > +  uint32_t type_id = INTVAL (operands[2]);
> > +
> > +  /* Calculate typeid offset from call target.  */
> > +  HOST_WIDE_INT offset = -kcfi_typeid_offset;
> > +
> > +  /* Generate custom label names.  */
> > +  char trap_name[32];
> > +  char call_name[32];
> > +  ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", kcfi_labelno);
> > +  ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", kcfi_labelno);
> > +
> > +  /* Create memory operand for the type load.  */
> > +  rtx mem_op = gen_rtx_MEM (SImode,
> > +                           gen_rtx_PLUS (SImode, operands[0],
> > +                                         GEN_INT (offset)));
> > +  rtx temp_operands[6];
> > +
> > +  /* Normally we can use r12 as our scratch register.  */
> > +  unsigned scratch_reg_num = IP_REGNUM;
> > +  /* If register pressure has made r12 our target register, we need to pick
> > +     a different register.  We don't want to spill our target register
> > +     because on reload at the end of the KCFI check, we'd be producing
> > +     the very kind of call gadget we were trying to protect against:
> > +     "pop %target; call %target".  In this case, use r3 as our scratch
> > +     register.  But since r3 may be used for function arguments, we need
> > +     to check if it is being used for that and only spill/reload if that
> > +     happens.  Any spill/reload of r3 due to making a call will already
> > +     have been managed by the register allocator, so we only have to care
> > +     about not clobbering the argument value it may be carrying into the
> > +     call here.  Also use r3 when r12 is a fixed register.  */
> > +  if (REGNO (operands[0]) == scratch_reg_num
> > +      || fixed_regs[scratch_reg_num])
> > +    scratch_reg_num = LAST_ARG_REGNUM;
> > +  rtx scratch_reg = gen_rtx_REG (SImode, scratch_reg_num);
> > +
> > +  /* We only need to spill r3 if it's actually used by the call.  */
> > +  bool need_spill = (scratch_reg_num == LAST_ARG_REGNUM)
> > +                   && reg_overlap_mentioned_p (scratch_reg, insn);
> > +
> > +  /* Calculate trap immediate.  */
> > +  unsigned addr_reg_num = REGNO (operands[0]);
> > +  /* The scratch register is always clobbered by eor seq: use 0x1F.  */
> > +  unsigned udf_immediate = 0x8000 | (0x1F << 5) | (addr_reg_num & 31);
> > +
> 
> I take it this means you still need to decode the instructions in the
> kernel to obtain the expected type id?

Currently, yes.

> Can't you insert the actual register index here, and defer the reload
> until after the UDF? That way, the scratch register will always
> contain the XOR of the actual vs expected typeids when taking the
> trap.

My instinct is to avoid any kind of load/call gadget (as a ROP target),
even if the controlled register is only the 4th argument. The risk is
much lower, but it seemed like reducing the risk to 0 requires just a
little help on the kernel side after taking the trap (and x86 already
does this reliably).

I suppose as an alternative I could use the index when it's not r3, but
then Linux would need to read the destination memory to rebuild the XOR?
I think that's even more fragile... I think it'd be best to just read
back the prior 5 instructions before the trap. It's reliable. :)
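
For reference, here is a rough sketch of the trap immediate layout from the quoted hunk: 0x8000 as the base, a 5-bit type-register field at bits 5-9, and a 5-bit target-register field at bits 0-4. The helper and macro names are hypothetical (none of them exist in the patch or in the kernel), and the decode side is only an assumption that mirrors the encoding above, not what the kernel trap handler actually does.

```c
#include <stdint.h>

/* Hypothetical names; only the bit layout comes from the quoted patch. */
#define KCFI_UDF_BASE      0x8000u  /* base value OR'ed into the UDF immediate */
#define KCFI_REG_MASK      0x1Fu    /* each register field is 5 bits wide */
#define KCFI_TYPE_REG_NONE 0x1Fu    /* scratch clobbered: no register holds the type id */

/* Mirror of: 0x8000 | (type_reg << 5) | (addr_reg_num & 31) */
static inline uint32_t kcfi_udf_encode(unsigned type_reg, unsigned addr_reg)
{
    return KCFI_UDF_BASE
           | ((type_reg & KCFI_REG_MASK) << 5)
           | (addr_reg & KCFI_REG_MASK);
}

/* Register that held the call target when the check failed. */
static inline unsigned kcfi_udf_addr_reg(uint32_t imm)
{
    return imm & KCFI_REG_MASK;
}

/* Type-register field; 0x1F in the current patch. */
static inline unsigned kcfi_udf_type_reg(uint32_t imm)
{
    return (imm >> 5) & KCFI_REG_MASK;
}

/* Nonzero when the handler has to decode the preceding instructions
   to recover the expected type id (assumed meaning of 0x1F). */
static inline int kcfi_udf_needs_decode(uint32_t imm)
{
    return kcfi_udf_type_reg(imm) == KCFI_TYPE_REG_NONE;
}
```

With the patch as posted, the type-register field is always 0x1F, so a handler built along these lines would always take the decode path; only the target-register field varies per call site.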

-Kees

-- 
Kees Cook