Message-ID: <1c496b56-7b82-2f18-ab53-1c5a930a4511@amd.com>
Date: Thu, 23 Mar 2023 14:01:45 +0530
From: "Aithal, Srikanth" <sraithal@....com>
To: Conor Dooley <conor.dooley@...rochip.com>,
Vlastimil Babka <vbabka@...e.cz>
Cc: Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
Linux-Next Mailing List <linux-next@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, lkft-triage@...ts.linaro.org,
Andrew Morton <akpm@...ux-foundation.org>,
Stephen Rothwell <sfr@...b.auug.org.au>, lstoakes@...il.com,
David Hildenbrand <david@...hat.com>,
"Liam R. Howlett" <liam.howlett@...cle.com>, willy@...radead.org,
vernon2gm@...il.com, Arnd Bergmann <arnd@...db.de>,
Anders Roxell <anders.roxell@...aro.org>
Subject: Re: next-20230323: arm64: vma_merge (mm/mmap.c:952 (discriminator 1))
 - Unable to handle kernel paging request at virtual address 0000000000100111

On 3/23/2023 1:43 PM, Conor Dooley wrote:
> On Thu, Mar 23, 2023 at 08:51:25AM +0100, Vlastimil Babka wrote:
>> On 3/23/23 08:35, Naresh Kamboju wrote:
>>> The following kernel crash was noticed on arm x15, arm64 hikey-6220, Juno-r2,
>>> x86_64 and i386 devices on Linux next-20230323.
>
> To add one more to the sample size, it's falling over on RISC-V too!
>
It's also failing on AMD arch, with the trace below:
[ 2.510619] BUG: unable to handle page fault for address: 0000000008100111
[ 2.513951] #PF: supervisor read access in kernel mode
[ 2.521156] usb 3-1.1: New USB device found, idVendor=1604,
idProduct=10c0, bcdDevice= 0.00
[ 2.513951] #PF: error_code(0x0000) - not-present page
[ 2.530981] usb 3-1.1: New USB device strings: Mfr=0, Product=0,
SerialNumber=0
[ 2.513951] PGD 0 P4D 0
[ 2.513951] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2.513951] CPU: 95 PID: 868 Comm: modprobe Not tainted
6.3.0-rc3-next-20230323-next-20230323-814642c #1
[ 2.513951] Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS
2.8.5 08/18/2022
[ 2.513951] RIP: 0010:vma_merge+0xe4/0xc50
[ 2.513951] Code: 0f 84 59 08 00 00 48 8b 45 88 49 39 47 08 0f 84 27
02 00 00 4d 85 f6 74 0a 4d 39 6e 08 0f 84 a7 01 00 00 31 c9 48 85 db 74
79 <48> 8b b3 a0 00 00 00 4c 39 e6 0f 84 98 00 00 00 4c 89 e7 88 8d 4f
[ 2.577270] hub 3-1.1:1.0: USB hub found
[ 2.513951] RSP: 0018:ffffb5e98ec47c88 EFLAGS: 00010206
[ 2.513951] RAX: 0000000000000000 RBX: 0000000008100071 RCX:
0000000000000000
[ 2.513951] RDX: ffff9857508c2c30 RSI: 0000000000100001 RDI:
ffff985754569870
[ 2.513951] RBP: ffffb5e98ec47d40 R08: 00000000000001bb R09:
0000000000000000
[ 2.513951] R10: 0000000000000000 R11: ffff98575452ef0c R12:
0000000000000000
[ 2.513951] R13: 00007f8be41f4000 R14: ffff985754569870 R15:
ffff985754568958
[ 2.594281] hub 3-1.1:1.0: 4 ports detected
[ 2.513951] FS: 00007f8be4f64740(0000) GS:ffff987640dc0000(0000)
knlGS:0000000000000000
[ 2.513951] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.513951] CR2: 0000000008100111 CR3: 0008000114566002 CR4:
0000000000770ee0
[ 2.513951] PKRU: 55555554
[ 2.620194] Call Trace:
[ 2.620194] <TASK>
[ 2.620194] mprotect_fixup+0x13e/0x320
[ 2.620194] do_mprotect_pkey+0x43c/0x4d0
[ 2.620194] ? do_user_addr_fault+0x34f/0x8e0
[ 2.620194] ? exit_to_user_mode_prepare+0x32/0x190
[ 2.620194] __x64_sys_mprotect+0x23/0x30
[ 2.688176] usb 3-1.4: new high-speed USB device number 4 using
xhci_hcd
[ 2.620194] do_syscall_64+0x3e/0x90
[ 2.620194] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 2.620194] RIP: 0033:0x7f8be4d40ebb
[ 2.620194] Code: 73 01 c3 48 8d 0d 2d e3 22 00 f7 d8 89 01 48 83 c8
ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa b8 0a 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8d 0d fd e2 22 00 f7 d8 89 01 48 83
[ 2.620194] RSP: 002b:00007fff1da6b298 EFLAGS: 00000206 ORIG_RAX:
000000000000000a
[ 2.620194] RAX: ffffffffffffffda RBX: 00007f8be4f6a3d0 RCX:
00007f8be4d40ebb
[ 2.620194] RDX: 0000000000000001 RSI: 0000000000004000 RDI:
00007f8be41f4000
[ 2.620194] RBP: 00007fff1da6b3c0 R08: 0000000000000000 R09:
00007f8be3e39000
[ 2.620194] R10: 00007f8be4f6a3d0 R11: 0000000000000206 R12:
0000000000000000
[ 2.620194] R13: 00007f8be41f8018 R14: 00007f8be4f6a3d0 R15:
00007f8be4f6a3d0
[ 2.620194] </TASK>
[ 2.620194] Modules linked in:
[ 2.620194] CR2: 0000000008100111
[ 2.620194] ---[ end trace 0000000000000000 ]---
[ 2.620194] pstore: backend (erst) writing error (-28)
[ 2.854021] usb 3-1.4: New USB device found, idVendor=1604,
idProduct=10c0, bcdDevice= 0.00
[ 2.620194] RIP: 0010:vma_merge+0xe4/0xc50
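
A quick cross-check of the x86_64 trace above, as a sketch only and not part
of the original report: the bytes at the faulting RIP (<48> 8b b3 a0 00 00 00)
look like a 64-bit load of the form "mov 0xa0(%rbx),%rsi", in which case CR2
should equal RBX + 0xa0. The helper name and the narrow set of encodings it
accepts are assumptions made for illustration; it is not a general
disassembler.

```python
# Hypothetical helper: decodes only "REX.W 8B /r" with mod=10 (disp32) and
# no SIB byte, which is the form the bytes in the oops appear to use.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_mov_load_disp32(code: bytes):
    """Return (dest_reg, base_reg, displacement) for a 7-byte mov load."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if (rex & 0xF8) != 0x48:          # REX prefix with W=1 (64-bit operand)
        raise ValueError("expected a REX.W prefix")
    if opcode != 0x8B:                # MOV r64, r/m64
        raise ValueError("expected opcode 0x8B")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:    # disp32 addressing, no SIB byte
        raise ValueError("unsupported addressing form")
    reg |= ((rex >> 2) & 1) << 3      # REX.R extends the destination register
    rm |= (rex & 1) << 3              # REX.B extends the base register
    disp = int.from_bytes(code[3:7], "little", signed=True)
    return REG64[reg], REG64[rm], disp

dest, base, disp = decode_mov_load_disp32(bytes.fromhex("488bb3a0000000"))
rbx = 0x0000000008100071              # RBX from the register dump
cr2 = 0x0000000008100111              # CR2 (faulting address) from the oops
print(f"mov {disp:#x}(%{base}),%{dest}; {base}+disp = {rbx + disp:#x}")
print("matches CR2:", rbx + disp == cr2)
```

If the decode is right, the kernel was reading a field at offset 0xa0 through
a pointer held in RBX whose value (0x8100071) is not a valid kernel address,
which would be consistent with the uninitialized 'next' discussed in the
quoted text below.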
>>> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
>>>
>>> crash log on arm64:
>>> ---------------
>>> [ 19.281223] Unable to handle kernel paging request at virtual
>>> address 0000000000100111
>>> [ 19.289189] Mem abort info:
>>> [ 19.291995] ESR = 0x0000000096000006
>>> [ 19.295757] EC = 0x25: DABT (current EL), IL = 32 bits
>>> [ 19.301086] SET = 0, FnV = 0
>>> [ 19.304151] EA = 0, S1PTW = 0
>>> [ 19.307302] FSC = 0x06: level 2 translation fault
>>> [ 19.312194] Data abort info:
>>> [ 19.315083] ISV = 0, ISS = 0x00000006
>>> [ 19.318930] CM = 0, WnR = 0
>>> [ 19.321901] user pgtable: 4k pages, 48-bit VAs, pgdp=00000008a23c5000
>>> [ 19.328374] [0000000000100111] pgd=08000008a14c5003,
>>> p4d=08000008a14c5003, pud=08000008a14c6003, pmd=0000000000000000
>>> [ 19.339037] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
>>> [ 19.345315] Modules linked in:
>>> [ 19.348373] CPU: 2 PID: 1 Comm: init Not tainted 6.3.0-rc3-next-20230323 #1
>>
>> next-20230323 seems to contain v2 of Lorenzo's vma_merge cleanups
>>
>>> [ 19.355347] Hardware name: ARM Juno development board (r2) (DT)
>>> [ 19.361273] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [ 19.368246] pc : vma_merge (mm/mmap.c:952 (discriminator 1))
>>
>> And this is a line involving 'next' and Liam pointed out a possibly
>> uninitialized next in v2, so that's probably it.
>> Andrew replaced it with a fixed version so it should make its way to -next
>> as well.
>
> Cool, hopefully it is fixed tomorrow :)
Thanks, will keep an eye on it.
Srikanth Aithal
>
> Cheers,
> Conor.
>
>>> [ 19.371917] lr : vma_merge (mm/mmap.c:945)
>>> [ 19.375670] sp : ffff80000b37bb40
>>> [ 19.378985] x29: ffff80000b37bb40 x28: ffff000820c0ff20 x27: 0000000000000000
>>> [ 19.386139] x26: ffff000820c17210 x25: ffff000800bfac00 x24: 0000ffff8e8b7000
>>> [ 19.393293] x23: 0000000000100071 x22: ffff000800898d80 x21: 0000000000100071
>>> [ 19.400446] x20: ffff80000b37bd18 x19: 0000ffff8e8ba000 x18: ffff80000b37bd18
>>> [ 19.407599] x17: 0000000000000000 x16: ffff8000099a58c8 x15: 0000ffff8e9aefff
>>> [ 19.414752] x14: 0000ffff8e8b7000 x13: 1fffe001041bb361 x12: ffff80000b37baf8
>>> [ 19.421905] x11: ffff000822473400 x10: ffff000820dd9b08 x9 : ffff80000830fc64
>>> [ 19.429057] x8 : 0000ffff8e8b7000 x7 : 0000ffff8e8b7000 x6 : ffff000820dd9b50
>>> [ 19.436210] x5 : ffff000820c0ff20 x4 : 0000000000000187 x3 : ffff000800bfac00
>>> [ 19.443363] x2 : 0000000000000000 x1 : 0000000000100071 x0 : 0000000000000000
>>> [ 19.450515] Call trace:
>>> [ 19.452960] vma_merge (mm/mmap.c:952 (discriminator 1))
>>> [ 19.456279] mprotect_fixup (mm/mprotect.c:676)
>>> [ 19.460034] do_mprotect_pkey.constprop.0 (mm/mprotect.c:862)
>>> [ 19.465094] __arm64_sys_mprotect (mm/mprotect.c:880)
>>> [ 19.469283] invoke_syscall (arch/arm64/include/asm/current.h:19
>>> arch/arm64/kernel/syscall.c:57)
>>> [ 19.473041] el0_svc_common (arch/arm64/include/asm/daifflags.h:28
>>> arch/arm64/kernel/syscall.c:150)
>>> [ 19.476796] do_el0_svc (arch/arm64/kernel/syscall.c:194)
>>> [ 19.480117] el0_svc (arch/arm64/include/asm/daifflags.h:28
>>> arch/arm64/kernel/entry-common.c:133
>>> arch/arm64/kernel/entry-common.c:142
>>> arch/arm64/kernel/entry-common.c:638)
>>> [ 19.483177] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:656)
>>> [ 19.487454] el0t_64_sync (arch/arm64/kernel/entry.S:591)
>>> [ 19.491123] Code: eb18001f 54000800 52800002 b40004d7 (f94052e1)
>>> All code
>>> ========
>>> 0:* 1f (bad) <-- trapping instruction
>>> 1: 00 18 add %bl,(%rax)
>>> 3: eb 00 jmp 0x5
>>> 5: 08 00 or %al,(%rax)
>>> 7: 54 push %rsp
>>> 8: 02 00 add (%rax),%al
>>> a: 80 52 d7 04 adcb $0x4,-0x29(%rdx)
>>> e: 00 .byte 0x0
>>> f: b4 e1 mov $0xe1,%ah
>>> 11: 52 push %rdx
>>> 12: 40 f9 rex stc
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>> 0: e1 52 loope 0x54
>>> 2: 40 f9 rex stc
>>
>> Looks like an x86 decodecode of arm64 code :) calling a wrong objdump or
>> something?
>>
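
For reference, the faulting word in the arm64 "Code:" line, (f94052e1), can
be decoded by hand as an A64 instruction instead of x86 bytes. The sketch
below handles only the 64-bit "LDR Xt, [Xn, #imm]" unsigned-offset encoding;
the function name is made up for illustration, and register number 31 (SP in
this encoding) is not treated specially.

```python
def decode_a64_ldr_x_uimm(word: int):
    """Decode A64 'LDR Xt, [Xn, #imm]' (64-bit, unsigned scaled offset).

    Returns (rt, rn, byte_offset). Raises ValueError for any other encoding.
    """
    # Bits 31..22 are 1111100101 for the 64-bit LDR unsigned-offset form.
    if (word >> 22) != 0b1111100101:
        raise ValueError("not a 64-bit LDR (unsigned offset) encoding")
    imm12 = (word >> 10) & 0xFFF      # offset in units of 8 bytes
    rn = (word >> 5) & 0x1F           # base register number
    rt = word & 0x1F                  # destination register number
    return rt, rn, imm12 * 8

rt, rn, offset = decode_a64_ldr_x_uimm(0xF94052E1)
x23 = 0x0000000000100071              # x23 from the register dump
fault = 0x0000000000100111            # faulting virtual address from the oops
print(f"ldr x{rt}, [x{rn}, #{offset:#x}]; x{rn}+offset = {x23 + offset:#x}")
print("matches fault address:", x23 + offset == fault)
```

Under these assumptions the instruction is a load from x23 + 0xa0, and x23
holds 0x100071 in the dump, which lines up with the reported fault address
0000000000100111.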
>>> [ 19.497226] ---[ end trace 0000000000000000 ]---
>>> [ 19.501883] Kernel panic - not syncing: Attempted to kill init!
>>> exitcode=0x0000000b
>>> [ 19.509551] SMP: stopping secondary CPUs
>>> [ 19.513665] Kernel Offset: disabled
>>> [ 19.517152] CPU features: 0x400002,0c3c0400,0000421b
>>> [ 19.522123] Memory Limit: none
>>> [ 19.525181] ---[ end Kernel panic - not syncing: Attempted to kill
>>> init! exitcode=0x0000000b ]---
>>>
>>>
>>> metadata:
>>> git_ref: master
>>> git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>>> git_sha: 7c4a254d78f89546d0e74a40617ef24c6151c8d1
>>> git_describe: next-20230323
>>> kernel_version: 6.3.0-rc3
>>> kernel-config:
>>> https://storage.tuxsuite.com/public/linaro/lkft/builds/2NOjwfRUa0fjWWZBWCUG4Ypift7/config
>>> build-url: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next/-/pipelines/815177945
>>> artifact-location:
>>> https://storage.tuxsuite.com/public/linaro/lkft/builds/2NOjwfRUa0fjWWZBWCUG4Ypift7
>>> toolchain: gcc-11
>>>
>>>
>>> --
>>> Linaro LKFT
>>> https://lkft.linaro.org
>>>
>>
>>