linux-kernel - Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte

Message-ID: <CAMj1kXFKzi14UCoiDOMwS5jyNz61_UzxGXm+ke0EWEt4nn6E1g@mail.gmail.com>
Date:   Wed, 20 Apr 2022 09:31:29 +0200
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Naresh Kamboju <naresh.kamboju@...aro.org>
Cc:     Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        open list <linux-kernel@...r.kernel.org>,
        Linux-Next Mailing List <linux-next@...r.kernel.org>,
        lkft-triage@...ts.linaro.org,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        Russell King - ARM Linux <linux@...linux.org.uk>,
        Arnd Bergmann <arnd@...db.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        max.krummenacher@...adex.com, Shawn Guo <shawnguo@...nel.org>,
        Stefano Stabellini <stefano.stabellini@...inx.com>,
        Christoph Hellwig <hch@....de>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [next] arm: boot failed - PC is at cpu_ca15_set_pte_ext

On Tue, 19 Apr 2022 at 12:59, Naresh Kamboju <naresh.kamboju@...aro.org> wrote:
>
> Linux next 20220419 boot failed on arm architecture qemu_arm and BeagleBoard
> x15 device.
>
> kernel crash log from x15:
> -----------------
> [    6.866516] 8<--- cut here ---
> [    6.869598] Unable to handle kernel paging request at virtual address f000e62c
> [    6.876861] [f000e62c] *pgd=82935811, *pte=00000000, *ppte=00000000
> [    6.883209] Internal error: Oops: 807 [#3] SMP ARM
> [    6.888000] Modules linked in:
> [    6.891082] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G      D W   5.18.0-rc3-next-20220419 #1
> [    6.899993] Hardware name: Generic DRA74X (Flattened Device Tree)
> [    6.906127] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> [    6.911041] LR is at handle_mm_fault+0x60c/0xed0
> [    6.915679] pc : [<c031f26c>]    lr : [<c04cfeb8>]    psr: 40000013
> [    6.921966] sp : f000dde8  ip : f000de44  fp : a0000013
> [    6.927215] r10: 00000000  r9 : 00000000  r8 : c1e95194
> [    6.932464] r7 : c3c95000  r6 : befffff1  r5 : 00000081  r4 : c29d8000
> [    6.939025] r3 : 00000000  r2 : 00000000  r1 : 00000040  r0 : f000de2c
> [    6.945587] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [    6.952758] Control: 10c5387d  Table: 8020406a  DAC: 00000051
> [    6.958526] Register r0 information: 2-page vmalloc region starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> [    6.969299] Register r1 information: non-paged memory
> [    6.974365] Register r2 information: NULL pointer
> [    6.979095] Register r3 information: NULL pointer
> [    6.983825] Register r4 information: slab task_struct start c29d8000 pointer offset 0
> [    6.991729] Register r5 information: non-paged memory
> [    6.996795] Register r6 information: non-paged memory
> [    7.001861] Register r7 information: slab vm_area_struct start c3c95000 pointer offset 0
> [    7.010009] Register r8 information: non-slab/vmalloc memory
> [    7.015716] Register r9 information: NULL pointer
> [    7.020446] Register r10 information: NULL pointer
> [    7.025238] Register r11 information: non-paged memory
> [    7.030426] Register r12 information: 2-page vmalloc region starting at 0xf000c000 allocated at kernel_clone+0x94/0x3b0
> [    7.041259] Process swapper/0 (pid: 1, stack limit = 0xfaff0077)
> [    7.047302] Stack: (0xf000dde8 to 0xf000e000)
> [    7.051696] dde0:                   c29d8000 00000cc0 c20a1108 c2065fa0 c1e09f50 b6db6db7
> [    7.059906] de00: c195bf0c 17c0f572 c29d8000 c3c95000 00000cc0 000befff befff000 befffff1
> [    7.068115] de20: 00000081 c3c3afb8 c3c3afb8 00000000 00000000 00000000 00000000 00000000
> [    7.076324] de40: 00000000 17c0f572 befff000 c3c95000 00002017 befffff1 00002017 00002fb8
> [    7.084564] de60: c2d04000 00000081 c29d8000 c04c6790 c20d01d4 00000000 00000001 c20ce440
> [    7.092773] de80: c1e10bcc fffff000 00000000 c2a45680 eeb33cc0 c29d8000 00000000 c2d04000
> [    7.100982] dea0: befffff1 f000df18 00000000 00002017 c20661a0 c04c77e8 f000df18 00000000
> [    7.109222] dec0: 00000000 c1d95c40 00000002 c20661e0 00000000 00000001 00000000 c04c7ad0
> [    7.117431] dee0: 00000011 c2d02a00 00000001 befffff1 c29d8000 00000000 00000011 c2a30010
> [    7.125640] df00: c29d8000 c0524c24 f000df18 00000000 00000000 2cd9e000 c1d95c40 17c0f572
> [    7.133850] df20: 00000000 c2d02a00 0000000b 00000ffc 00000000 befffff1 00000000 c0524f74
> [    7.142089] df40: c1e0e394 c2d02a00 c209a71c 38e38e39 c29d8000 bee00008 c2d02a00 c2a30000
> [    7.150299] df60: c1e0e394 c1e0e420 00000000 00000000 00000000 c05266bc c209a000 c1944c60
> [    7.158508] df80: 00000000 00000000 00000000 c129d2b4 c209a000 c1e0e394 00000000 c12b5600
> [    7.166748] dfa0: 00000000 c12b5518 00000000 c0300168 00000000 00000000 00000000 00000000
> [    7.174957] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [    7.183166] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [    7.191406] Code: 13110001 12211b02 13110b02 03a03000 (e5a03800)

This decodes to

   0: 13110001 tstne r1, #1
   4: 12211b02 eorne r1, r1, #2048 ; 0x800
   8: 13110b02 tstne r1, #2048 ; 0x800
   c: 03a03000 moveq r3, #0
  10:* e5a03800 str r3, [r0, #2048]! ; 0x800 <-- trapping instruction

and R0 points into the stack. So we are updating a PTE that is located
on the stack rather than in a page table somewhere, which seems very
odd. However, this could be a latent bug that got uncovered by the
VMAP stacks changes.
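
For anyone who wants to re-check the decode and the address arithmetic by hand, here is a minimal Python sketch. It is not a kernel tool, just a field-by-field decode of the trapping word using the A32 LDR/STR (immediate) layout, followed by a comparison of the store target with the faulting address. All register and address values are copied from the oops above. The 4 KiB page size is an assumption, and the helper names are made up for this illustration.

```python
# Values copied from the oops quoted above.
INSN = 0xE5A03800        # trapping word from the "Code:" line
R0 = 0xF000DE2C          # r0 at the time of the fault
FAULT_ADDR = 0xF000E62C  # "Unable to handle kernel paging request at ..."
STACK_BASE = 0xF000C000  # "2-page vmalloc region starting at 0xf000c000"
PAGE_SIZE = 4096         # assumption: 4 KiB pages
STACK_END = STACK_BASE + 2 * PAGE_SIZE


def decode_ldr_str_imm(insn):
    """Split an A32 LDR/STR (immediate offset) word into its bit fields."""
    if (insn >> 26) & 0b11 != 0b01 or (insn >> 25) & 1:
        raise ValueError("not an immediate-offset LDR/STR encoding")
    return {
        "cond": (insn >> 28) & 0xF,              # 0xE means "always"
        "pre_index": bool((insn >> 24) & 1),     # P bit
        "add": bool((insn >> 23) & 1),           # U bit
        "writeback": bool((insn >> 21) & 1),     # W bit
        "load": bool((insn >> 20) & 1),          # L bit: 0 = store
        "rn": (insn >> 16) & 0xF,                # base register
        "rd": (insn >> 12) & 0xF,                # transferred register
        "imm12": insn & 0xFFF,                   # unsigned offset
    }


def access_address(fields, base):
    """Address actually accessed, given the value of the base register."""
    if not fields["pre_index"]:
        return base  # post-indexed: the access uses the unmodified base
    offset = fields["imm12"] if fields["add"] else -fields["imm12"]
    return (base + offset) & 0xFFFFFFFF


fields = decode_ldr_str_imm(INSN)
target = access_address(fields, R0)

print(f"str r{fields['rd']}, [r{fields['rn']}, #{fields['imm12']:#x}]!")
print(f"store target: {target:#010x}")
print("r0 inside stack region:   ", STACK_BASE <= R0 < STACK_END)
print("target past end of region:", target >= STACK_END)
```

Under those assumptions the store target works out to r0 + 0x800 = f000e62c, which is the faulting address, and it lies beyond the end of the 2-page region that r0 itself points into.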

Unfortunately, the vmlinux.xz file I downloaded from the link below
seems to be different from the one that produced the crash, given that
the LR address of c04cfeb8 does not seem to correspond with
handle_mm_fault+0x60c/0xed0.
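
The kind of cross-check described above can be sketched roughly as follows: resolve an address against a System.map and compare the result with the symbol+offset the oops printed. This is only an illustration in Python. The helper names are hypothetical, and the three-line sample map is invented so that it agrees with the oops; it is not taken from the real System.map at the link below.

```python
import bisect


def parse_system_map(text):
    """Return a sorted list of (address, name) for text symbols (t/T)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        addr, kind, name = parts
        if kind in ("t", "T"):
            symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def symbolize(symbols, addr):
    """Map an address to (name, offset) of the closest preceding symbol."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


def matches_oops(symbols, addr, name, offset):
    """True if the map resolves addr to the same name+offset as the oops."""
    return symbolize(symbols, addr) == (name, offset)


# Invented sample, NOT the real System.map: handle_mm_fault is placed at
# LR - 0x60c, which is where the crashing kernel must have had it.
SAMPLE_MAP = """\
c04cf000 T made_up_previous_function
c04cf8ac T handle_mm_fault
c04d077c T made_up_next_function
"""

LR = 0xC04CFEB8  # "lr : [<c04cfeb8>]" from the oops
symbols = parse_system_map(SAMPLE_MAP)
name, offset = symbolize(symbols, LR)
print(f"{LR:#010x} -> {name}+{offset:#x}")
print("consistent with oops:", matches_oops(symbols, LR, "handle_mm_fault", 0x60C))
```

Feeding it the contents of the actual System.map instead of `SAMPLE_MAP` would show whether that build places `handle_mm_fault` at c04cf8ac. If it does not, the artifacts and the crashing kernel come from different builds.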

Can you please double check the artifacts?



> metadata:
>   git_ref: master
>   git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>   git_sha: 634de1db0e9bbeb90d7b01020e59ec3dab4d38a1
>   git_describe: next-20220419
>   kernel-config: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/config
>   System.map:  https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/System.map
>   vmlinux.xz: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R/vmlinux.xz
>   build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/519362851
>   build: https://builds.tuxbuild.com/280TXP6P7tIBfnowvFY4wobXp3R
>   toolchain: gcc-10
>
> --
> Linaro LKFT
> https://lkft.linaro.org
>
> [1] https://lkft.validation.linaro.org/scheduler/job/4921995#L2616
> [2] https://lkft.validation.linaro.org/scheduler/job/4922061#L552