Message-ID: <CAMj1kXHMVBqOvk-zzSsv-0sSWgfKxWCj8Vv+FPyHO_8SUWNgPw@mail.gmail.com>
Date: Mon, 21 Jul 2025 16:10:03 +1000
From: Ard Biesheuvel <ardb@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: 刘海燕 (Haiyan Liu) <haiyan.liu@...soc.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
"rust-for-linux@...r.kernel.org" <rust-for-linux@...r.kernel.org>,
代子为 (Ziwei Dai) <Ziwei.Dai@...soc.com>,
周平 (Ping Zhou/9032) <Ping.Zhou1@...soc.com>,
杨丽娜 (Lina Yang) <lina.yang@...soc.com>,
王双 (Shuang Wang) <shuang.wang@...soc.com>,
Alice Ryhl <aliceryhl@...gle.com>, Miguel Ojeda <ojeda@...nel.org>,
Matthew Maurer <mmaurer@...gle.com>, Sami Tolvanen <samitolvanen@...gle.com>
Subject: Re: Meet compiled kernel binaray abnormal issue while enabling
generic kasan in kernel 6.12 with some default KBUILD_RUSTFLAGS on
On Thu, 17 Jul 2025 at 20:39, Mark Rutland <mark.rutland@....com> wrote:
>
> Hi,
>
> From a quick scan, I think this might have something to do with
> UNWIND_PATCH_PAC_INTO_SCS, notes below.
>
> On Mon, Jul 14, 2025 at 03:12:33AM +0000, 刘海燕 (Haiyan Liu) wrote:
> > I am enabling the generic KASAN feature in kernel 6.12 and hit a kernel boot crash:
> > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > pc : do_basic_setup+0x6c/0xac
> > lr : do_basic_setup+0x88/0xac
> > sp : ffffffc080087e40
>
> Can you say which hardware this is on? Given this is a NULL-dereference,
> this looks like a dodgy pointer (or memory corruption) rather than a PAC
> failure.
>
> > After debugging, I found an error in do_ctors().
> > Normally, the compiler should insert the paciasp instruction at the function entry so that its corresponding autiasp instruction is used to validate the return address when the function returns.
> > NSX:FFFFFFC0800A840C|F800865E asan.module_ctor: str x30,[x18],#0x8 ; x30,[x18],#8
> > NSX:FFFFFFC0800A8410|A9BF7BFD stp x29,x30,[sp,#-0x10]! ; x29,x30,[sp,#-16]!
> > NSX:FFFFFFC0800A8414|910003FD mov x29,sp
> > NSX:FFFFFFC0800A8418|B0023420 adrp x0,0xFFFFFFC08472D000
> > NSX:FFFFFFC0800A841C|91390000 add x0,x0,#0xE40 ; x0,x0,#3648
> > NSX:FFFFFFC0800A8420|528000A1 mov w1,#0x5 ; w1,#5
> > NSX:FFFFFFC0800A8424|9422AF50 bl 0xFFFFFFC080954164 ; __asan_register_globals
> > NSX:FFFFFFC0800A8428|A8C17BFD ldp x29,x30,[sp],#0x10 ; x29,x30,[sp],#16
> > NSX:FFFFFFC0800A842C|F85F8E5E ldr x30,[x18,#-0x8]! ; x30,[x18,#-8]!
> > NSX:FFFFFFC0800A8430|D65F03C0 ret
>
> Here you evidently have shadow call stack enabled...
>
> > NSX:FFFFFFC0800A8478|D503233F asan.module_ctor: paciasp
> > NSX:FFFFFFC0800A847C|A9BF7BFD stp x29,x30,[sp,#-0x10]! ; x29,x30,[sp,#-16]!
> > NSX:FFFFFFC0800A8480|910003FD mov x29,sp
> > NSX:FFFFFFC0800A8484|B0023420 adrp x0,0xFFFFFFC08472D000
> > NSX:FFFFFFC0800A8488|913E0000 add x0,x0,#0xF80 ; x0,x0,#3968
> > NSX:FFFFFFC0800A848C|52800021 mov w1,#0x1 ; w1,#1
> > NSX:FFFFFFC0800A8490|9422AF35 bl 0xFFFFFFC080954164 ; __asan_register_globals
> > NSX:FFFFFFC0800A8494|A8C17BFD ldp x29,x30,[sp],#0x10 ; x29,x30,[sp],#16
> > NSX:FFFFFFC0800A8498|D50323BF autiasp
> > NSX:FFFFFFC0800A849C|D65F03C0 ret
>
> ... but here you evidently don't, and have PAC instead.
>
> Are these from the same kernel Image?
>
> Are these decoded from the static kernel binary, or are these dumps from
> memory once a kernel has booted (or is in the process of booting)?
>
> > But actually, in two asan.module_ctor functions, only the autiasp instruction is inserted before the return to validate the return address, while the paciasp instruction at function entry is missing.
> > NSX:FFFFFFC0800A72D8|F800865E asan.module_ctor: str x30,[x18],#0x8 ; x30,[x18],#8
> > NSX:FFFFFFC0800A72DC|F81F0FFE str x30,[sp,#-0x10]! ; x30,[sp,#-16]!
> > NSX:FFFFFFC0800A72E0|B00233C0 adrp x0,0xFFFFFFC084720000
> > NSX:FFFFFFC0800A72E4|91350000 add x0,x0,#0xD40 ; x0,x0,#3392
> > NSX:FFFFFFC0800A72E8|52803D61 mov w1,#0x1EB ; w1,#491
> > NSX:FFFFFFC0800A72EC|9422B39E bl 0xFFFFFFC080954164 ; __asan_register_globals
> > NSX:FFFFFFC0800A72F0|F84107FE ldr x30,[sp],#0x10 ; x30,[sp],#16
> > NSX:FFFFFFC0800A72F4|D50323BF autiasp
> > NSX:FFFFFFC0800A72F8|D65F03C0 ret
>
> This has a mixture of SCS and PAC; there's a shadow call stack prologue
> but a PAC epilogue:
>
> str x30, [x18], #8 // SCS
> ...
> autiasp // PAC
>
> ... so I'll hazard a guess that these are dumps from memory, and you
> have UNWIND_PATCH_PAC_INTO_SCS selected. Assuming that is the case,
> either this dump has been made mid-patching, or the patching has gone
> wrong somehow and left the prologues/epilogues in an inconsistent state
> (and the NULL dereference could be a secondary effect of that).
>
> Ard, does that sound plausible to you?
>
Yes, that is definitely possible.

Since commit 54c968bec344 ("arm64: Apply dynamic shadow call stack
patching in two passes") we are more careful about patching a function
in its entirety or not at all if any of the DWARF metadata is
misunderstood (or invalid). However, if the DWARF metadata is
inaccurate, but does not trigger an error, the patching will happen
and an error such as this one is likely to occur as a result.

Note that PACIASP and AUTIASP do not necessarily occur in pairs, so it
is not generally feasible to validate the DWARF against the code,
especially at runtime. However, a function (or FDE frame) that has one
PACIASP should at least have one AUTIASP too.
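
As a rough illustration of that last property, here is a small Python
sketch that classifies a function's instruction words by the
return-address protection sequences they contain. The four encodings
are taken from the disassembly quoted earlier in this thread; the
function name classify() and the labels it returns are made up for
this example, and this is not how the kernel's patching code works.
Treating an AUTIASP without any PACIASP as suspicious is also an
assumption of the sketch.

```python
# Instruction encodings as they appear in the dumps quoted in this thread.
PACIASP = 0xD503233F   # paciasp
AUTIASP = 0xD50323BF   # autiasp
SCS_PUSH = 0xF800865E  # str x30, [x18], #8
SCS_POP = 0xF85F8E5E   # ldr x30, [x18, #-8]!


def classify(words):
    """Return a hypothetical label for the sequences found in one function."""
    has_pac = PACIASP in words
    has_aut = AUTIASP in words
    has_push = SCS_PUSH in words
    has_pop = SCS_POP in words

    if (has_pac or has_aut) and (has_push or has_pop):
        return "mixed"         # e.g. an SCS prologue with a PAC epilogue
    if has_pac != has_aut:
        return "unpaired-pac"  # assumption: either one alone is suspicious
    if has_push != has_pop:
        return "unpaired-scs"
    if has_pac:
        return "pac"
    if has_push:
        return "scs"
    return "none"


# Third asan.module_ctor dump from the report: SCS push, AUTIASP epilogue.
broken = [0xF800865E, 0xF81F0FFE, 0xB00233C0, 0x91350000, 0x52803D61,
          0x9422B39E, 0xF84107FE, 0xD50323BF, 0xD65F03C0]
print(classify(broken))
```

This only tests for presence, not for one-to-one pairing, since a
function with several return paths can legitimately have more AUTIASP
instructions than PACIASP instructions.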
> I can't see why that would depend on KBUILD_RUSTFLAGS, but maybe the
> DWARF generated by rustc has confused the patching code somehow, or the
> linker has aggregated that in a surprising way.
>
I would suspect the DWARF metadata in this case. There are valid cases
where the DW_CFA_negate_ra_state annotation is attached to an
instruction other than PACIASP or AUTIASP, and so we are not able to
detect the case where the annotation is misplaced (i.e., attached to
the preceding or subsequent instruction).

So the important thing to check here is whether the objects in
question have the correct DWARF annotations for these
asan.module_ctor() routines. This can be done using 'llvm-readelf
--unwind' (example below, using an arbitrary object from a defconfig
build with kasan, rust and dynamic shadow call stack enabled). In this
case, both routines are correctly annotated, i.e., the return
address (RA) state toggles to signed at offset 0x4 and toggles back to
unsigned/authenticated at 0x24.

$ llvm-objdump -d kernel/seccomp.o
Disassembly of section .text.asan.module_ctor:
0000000000000000 <asan.module_ctor>:
0: d503233f paciasp
4: a9bf7bfd stp x29, x30, [sp, #-0x10]!
8: 910003fd mov x29, sp
c: 90000000 adrp x0, 0x0 <asan.module_ctor>
10: 91000000 add x0, x0, #0x0
14: 528003c1 mov w1, #0x1e // =30
18: 94000000 bl 0x18 <asan.module_ctor+0x18>
1c: a8c17bfd ldp x29, x30, [sp], #0x10
20: d50323bf autiasp
24: d65f03c0 ret
Disassembly of section .text.asan.module_dtor:
0000000000000000 <asan.module_dtor>:
0: d503233f paciasp
4: a9bf7bfd stp x29, x30, [sp, #-0x10]!
8: 910003fd mov x29, sp
c: 90000000 adrp x0, 0x0 <asan.module_dtor>
10: 91000000 add x0, x0, #0x0
14: 528003c1 mov w1, #0x1e // =30
18: 94000000 bl 0x18 <asan.module_dtor+0x18>
1c: a8c17bfd ldp x29, x30, [sp], #0x10
20: d50323bf autiasp
24: d65f03c0 ret
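
To locate the signing and authentication instructions in a given
object without reading the listing by eye, the disassembly can be
scanned mechanically. A minimal Python sketch that works on text
shaped like the llvm-objdump output above; the helper name
pac_offsets() and the regular expressions are assumptions of this
example, and other objdump versions may lay out their lines
differently.

```python
import re

# Matches lines such as "0: d503233f paciasp" or "20: d50323bf autiasp".
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(\S+)")
# Matches symbol headers such as "0000000000000000 <asan.module_ctor>:".
HEADER_RE = re.compile(r"^[0-9a-f]+ <(.+)>:$")


def pac_offsets(disassembly):
    """Map each function name to the offsets of its paciasp/autiasp."""
    result = {}
    current = None
    for line in disassembly.splitlines():
        header = HEADER_RE.match(line.strip())
        if header:
            current = header.group(1)
            result[current] = {"paciasp": [], "autiasp": []}
            continue
        insn = INSN_RE.match(line)
        if insn and current and insn.group(3) in ("paciasp", "autiasp"):
            result[current][insn.group(3)].append(int(insn.group(1), 16))
    return result


sample = """\
0000000000000000 <asan.module_ctor>:
0: d503233f paciasp
4: a9bf7bfd stp x29, x30, [sp, #-0x10]!
1c: a8c17bfd ldp x29, x30, [sp], #0x10
20: d50323bf autiasp
24: d65f03c0 ret
"""
print(pac_offsets(sample))
```

For the object shown here this yields offset 0x0 for paciasp and 0x20
for autiasp, i.e., the instructions immediately before the locations
at which the RA state is expected to toggle.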
$ llvm-readelf -u kernel/seccomp.o
(this requires a bit of manual inspection, given that readelf does not
take the .rela.eh_frame section into account, and so the initial
locations are section relative, and you're looking for FDE frames that
have initial location 0x0)
...
[0x58c] FDE length=40 cie=[0x0]
initial_location: 0x0
address_range: 0x28 (end : 0x28)
Program:
DW_CFA_advance_loc: 4 to 0x4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_advance_loc: 4 to 0x8
DW_CFA_def_cfa_offset: +16
DW_CFA_advance_loc: 4 to 0xc
DW_CFA_def_cfa: reg29 +16
DW_CFA_offset: reg30 -8
DW_CFA_offset: reg29 -16
DW_CFA_advance_loc: 16 to 0x1c
DW_CFA_def_cfa: reg31 +16
DW_CFA_advance_loc: 4 to 0x20
DW_CFA_def_cfa_offset: +0
DW_CFA_advance_loc: 4 to 0x24
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_restore: reg30
DW_CFA_restore: reg29
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
[0x5b8] FDE length=44 cie=[0x0]
initial_location: 0x0
address_range: 0x28 (end : 0x28)
Program:
DW_CFA_advance_loc: 4 to 0x4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_advance_loc: 4 to 0x8
DW_CFA_def_cfa_offset: +16
DW_CFA_advance_loc: 4 to 0xc
DW_CFA_def_cfa: reg29 +16
DW_CFA_offset: reg30 -8
DW_CFA_offset: reg29 -16
DW_CFA_advance_loc: 16 to 0x1c
DW_CFA_def_cfa: reg31 +16
DW_CFA_advance_loc: 4 to 0x20
DW_CFA_def_cfa_offset: +0
DW_CFA_advance_loc: 4 to 0x24
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_restore: reg30
DW_CFA_restore: reg29
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
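
The manual inspection can be partly automated. The Python sketch below
walks text shaped like the llvm-readelf output above and records, for
each FDE, the locations at which DW_CFA_AARCH64_negate_ra_state
appears. The function name negate_ra_offsets() and the dictionary
layout are hypothetical, the parsing is based only on the lines shown
in this mail, and locations remain section relative as noted above.

```python
import re

FDE_RE = re.compile(r"^\[0x[0-9a-f]+\] FDE ")
CIE_RE = re.compile(r"^\[0x[0-9a-f]+\] CIE ")
INIT_RE = re.compile(r"^initial_location: (0x[0-9a-f]+)")
ADVANCE_RE = re.compile(r"^DW_CFA_advance_loc\d*: \d+ to (0x[0-9a-f]+)")


def negate_ra_offsets(unwind_text):
    """Return one entry per FDE listing where the RA state is toggled."""
    fdes = []
    current = None  # FDE being parsed, or None while inside a CIE
    loc = 0         # current location within the FDE's CFA program
    for raw in unwind_text.splitlines():
        line = raw.strip()
        if FDE_RE.match(line):
            current = {"initial_location": None, "toggles": []}
            fdes.append(current)
            continue
        if CIE_RE.match(line):
            current = None
            continue
        if current is None:
            continue
        m = INIT_RE.match(line)
        if m:
            loc = int(m.group(1), 16)
            current["initial_location"] = loc
            continue
        m = ADVANCE_RE.match(line)
        if m:
            loc = int(m.group(1), 16)
            continue
        if line.startswith("DW_CFA_AARCH64_negate_ra_state"):
            current["toggles"].append(loc)
    return fdes


sample = """\
[0x58c] FDE length=40 cie=[0x0]
initial_location: 0x0
address_range: 0x28 (end : 0x28)
Program:
DW_CFA_advance_loc: 4 to 0x4
DW_CFA_AARCH64_negate_ra_state:
DW_CFA_advance_loc: 16 to 0x1c
DW_CFA_advance_loc: 4 to 0x24
DW_CFA_AARCH64_negate_ra_state:
"""
print(negate_ra_offsets(sample))
```

For the two FDEs listed here, the toggles come out at 0x4 and 0x24. An
FDE for one of the failing asan.module_ctor() routines that reports
different locations, or an odd number of toggles, would be worth a
closer look.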