Message-ID: <ZRhKq6e5nF/4ZIV1@fedora>
Date: Sun, 1 Oct 2023 01:26:03 +0900
From: Hyeonggon Yoo <42.hyeyoo@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
David Kaplan <David.Kaplan@....com>,
Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>,
x86@...nel.org, 42.hyeyoo@...il.com
Subject: Re: Linux 6.6-rc3 (DEBUG_VIRTUAL is unhappy on x86)
On Sun, Sep 24, 2023 at 02:36:21PM -0700, Linus Torvalds wrote:
> Another week, another -rc.
>
> As usual, rc3 is a bit larger than rc2, as people have started finding
> more issues.
>
> Unusually, we have a large chunk of changes in filesystems. Part of it
> is the vfs-level revert of some of the timestamp handling that needs
> to soak a bit more, and part of it is some xfs fixes. With a few other
> filesystem fixes too.
>
> But drivers and architecture updates are also up there, so it's not
> like the fs stuff dominates. It's just more noticeable than it usually
> is.
>
> Anyway, please do go test. None of this looks scary,
>
> Linus
>
> ---
[...]
> Peter Zijlstra (1):
> x86,static_call: Fix static-call vs return-thunk
Hello, the commit above causes a crash on x86 kernels built with
CONFIG_DEBUG_VIRTUAL=y.
The compiler is gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1).
Below are the dmesg (raw), the dmesg (decoded), the git bisect log,
and the configuration used.
I'm not sure whether it would lead to an unwelcome surprise, because
vmalloc_to_page(any valid kernel address) should work anyway.
But it seems that, for some reason, while updating kernel code,
the kernel confuses the kernel text area with the vmalloc/module area.
This should be an x86-specific issue.
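To illustrate the suspected confusion, here is a rough userspace sketch that classifies the address passed to vmalloc_to_page() (RDI/RSI/RBX = ffffffff83ce1124 in the register dump below) against the x86-64 virtual memory map. The range constants are assumptions: they are the documented defaults for 4-level paging without KASLR-randomized bases, not values read from this kernel, and the helper names are made up for illustration. The BUG at mm/vmalloc.c:673 is presumably the DEBUG_VIRTUAL range check, which this sketch only approximates.

```python
# Assumed default x86-64 ranges (4-level paging, no randomized bases).
# These are illustrative constants, not values taken from the crashing kernel.
VMALLOC_START = 0xFFFFC90000000000
VMALLOC_END = 0xFFFFE8FFFFFFFFFF
KERNEL_TEXT_START = 0xFFFFFFFF80000000
KERNEL_TEXT_END = 0xFFFFFFFF9FFFFFFF
MODULES_START = 0xFFFFFFFFA0000000
MODULES_END = 0xFFFFFFFFFEFFFFFF


def classify(addr):
    """Name the (assumed) region of the memory map that addr falls into."""
    if VMALLOC_START <= addr <= VMALLOC_END:
        return "vmalloc"
    if MODULES_START <= addr <= MODULES_END:
        return "module"
    if KERNEL_TEXT_START <= addr <= KERNEL_TEXT_END:
        return "kernel text mapping"
    return "other"


def is_vmalloc_or_module(addr):
    """Approximation of the range check that a DEBUG_VIRTUAL build enforces."""
    return classify(addr) in ("vmalloc", "module")


if __name__ == "__main__":
    poked = 0xFFFFFFFF83CE1124  # RDI/RSI/RBX in the register dump
    print(hex(poked), "->", classify(poked))
    print("passes vmalloc/module check:", is_vmalloc_or_module(poked))
```

Under these assumptions the poked address lies in the kernel text mapping, so a vmalloc/module range check on it would fail, which is consistent with the BUG being reached from __text_poke().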
==== dmesg (raw) ====
On top of commit aee9d30b9744, the log is below.
[ 0.242439] ------------[ cut here ]------------
[ 0.242840] kernel BUG at mm/vmalloc.c:673!
[ 0.243255] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[ 0.243837] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.0-rc2+ #60
[ 0.243837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[ 0.243837] RIP: 0010:vmalloc_to_page+0x28b/0x390
[ 0.243837] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd 8
[ 0.243837] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[ 0.243837] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[ 0.243837] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[ 0.243837] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[ 0.243837] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[ 0.243837] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[ 0.243837] FS: 0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[ 0.243837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.243837] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[ 0.243837] PKRU: 55555554
[ 0.243837] Call Trace:
[ 0.243837] <TASK>
[ 0.243837] ? die+0x36/0x90
[ 0.243837] ? do_trap+0xda/0x100
[ 0.243837] ? vmalloc_to_page+0x28b/0x390
[ 0.243837] ? do_error_trap+0x6a/0x90
[ 0.243837] ? vmalloc_to_page+0x28b/0x390
[ 0.243837] ? exc_invalid_op+0x50/0x70
[ 0.243837] ? vmalloc_to_page+0x28b/0x390
[ 0.243837] ? asm_exc_invalid_op+0x1a/0x20
[ 0.243837] ? vmalloc_to_page+0x28b/0x390
[ 0.243837] ? vmalloc_to_page+0x283/0x390
[ 0.243837] __text_poke+0x2d8/0x510
[ 0.243837] ? __pfx_text_poke_memcpy+0x10/0x10
[ 0.243837] ? srso_alias_return_thunk+0x5/0x7f
[ 0.243837] ? text_poke_loc_init+0x78/0x1e0
[ 0.243837] text_poke_bp_batch+0x91/0x300
[ 0.243837] text_poke_bp+0x4f/0x70
[ 0.243837] __static_call_transform+0xc0/0x200
[ 0.243837] arch_static_call_transform+0x83/0xa0
[ 0.243837] __static_call_init+0x20e/0x280
[ 0.243837] ? __pfx_static_call_init+0x10/0x10
[ 0.243837] static_call_init+0x39/0xa0
[ 0.243837] ? __pfx_static_call_init+0x10/0x10
[ 0.243837] do_one_initcall+0x5d/0x320
[ 0.243837] kernel_init_freeable+0x231/0x470
[ 0.243837] ? __pfx_kernel_init+0x10/0x10
[ 0.243837] kernel_init+0x1a/0x1c0
[ 0.243837] ret_from_fork+0x34/0x50
[ 0.243837] ? __pfx_kernel_init+0x10/0x10
[ 0.243837] ret_from_fork_asm+0x1b/0x30
[ 0.243837] </TASK>
[ 0.243837] Modules linked in:
[ 0.243841] ---[ end trace 0000000000000000 ]---
[ 0.244395] RIP: 0010:vmalloc_to_page+0x28b/0x390
[ 0.244840] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd 8
[ 0.245840] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[ 0.246349] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[ 0.246839] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[ 0.247516] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[ 0.247839] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[ 0.248522] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[ 0.248840] FS: 0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[ 0.249611] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.249839] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[ 0.250522] PKRU: 55555554
[ 0.250792] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.250837] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
==== dmesg (decoded) ====
[ 0.242439] ------------[ cut here ]------------
[ 0.242840] kernel BUG at mm/vmalloc.c:673!
[ 0.243255] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[ 0.243837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[ 0.243837] RIP: 0010:vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.243837] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd ff ff 84 c0 0f 85 b2 fd ff ff <0f> 0b 48 81 e1 00 00 00 c0 e9 ea fe ff ff 0f 0b e9 ab fd ff ff 48
All code
========
0: 31 d0 xor %edx,%eax
2: 48 23 05 5e d0 5e 01 and 0x15ed05e(%rip),%rax # 0x15ed067
9: 48 c1 e8 0c shr $0xc,%rax
d: 48 c1 e0 06 shl $0x6,%rax
11: 48 03 05 1f a5 5d 01 add 0x15da51f(%rip),%rax # 0x15da537
18: e9 cf fd ff ff jmp 0xfffffffffffffdec
1d: e8 4d dd ff ff call 0xffffffffffffdd6f
22: 84 c0 test %al,%al
24: 0f 85 b2 fd ff ff jne 0xfffffffffffffddc
2a:* 0f 0b ud2 <-- trapping instruction
2c: 48 81 e1 00 00 00 c0 and $0xffffffffc0000000,%rcx
33: e9 ea fe ff ff jmp 0xffffffffffffff22
38: 0f 0b ud2
3a: e9 ab fd ff ff jmp 0xfffffffffffffdea
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 48 81 e1 00 00 00 c0 and $0xffffffffc0000000,%rcx
9: e9 ea fe ff ff jmp 0xfffffffffffffef8
e: 0f 0b ud2
10: e9 ab fd ff ff jmp 0xfffffffffffffdc0
15: 48 rex.W
[ 0.243837] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[ 0.243837] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[ 0.243837] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[ 0.243837] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[ 0.243837] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[ 0.243837] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[ 0.243837] FS: 0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[ 0.243837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.243837] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[ 0.243837] PKRU: 55555554
[ 0.243837] Call Trace:
[ 0.243837] <TASK>
[ 0.243837] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
[ 0.243837] ? do_trap (arch/x86/kernel/traps.c:112 arch/x86/kernel/traps.c:153)
[ 0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.243837] ? do_error_trap (./arch/x86/include/asm/traps.h:59 arch/x86/kernel/traps.c:174)
[ 0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.243837] ? exc_invalid_op (arch/x86/kernel/traps.c:265)
[ 0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.243837] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 2))
[ 0.243837] __text_poke (arch/x86/kernel/alternative.c:1783)
[ 0.243837] ? __pfx_text_poke_memcpy (arch/x86/kernel/alternative.c:1753)
[ 0.243837] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:186)
[ 0.243837] ? text_poke_loc_init (arch/x86/kernel/alternative.c:2308 (discriminator 1))
[ 0.243837] text_poke_bp_batch (arch/x86/kernel/alternative.c:2198 (discriminator 1))
[ 0.243837] text_poke_bp (arch/x86/kernel/alternative.c:2431)
[ 0.243837] __static_call_transform (arch/x86/kernel/static_call.c:112)
[ 0.243837] arch_static_call_transform (arch/x86/kernel/static_call.c:172)
[ 0.243837] __static_call_init (kernel/static_call_inline.c:233 (discriminator 1))
[ 0.243837] ? __pfx_static_call_init (kernel/static_call_inline.c:486)
[ 0.243837] static_call_init (kernel/static_call_inline.c:41 kernel/static_call_inline.c:497)
[ 0.243837] ? __pfx_static_call_init (kernel/static_call_inline.c:486)
[ 0.243837] do_one_initcall (init/main.c:1232)
[ 0.243837] kernel_init_freeable (init/main.c:1337 (discriminator 1) init/main.c:1537 (discriminator 1))
[ 0.243837] ? __pfx_kernel_init (init/main.c:1429)
[ 0.243837] kernel_init (init/main.c:1439)
[ 0.243837] ret_from_fork (arch/x86/kernel/process.c:153)
[ 0.243837] ? __pfx_kernel_init (init/main.c:1429)
[ 0.243837] ret_from_fork_asm (arch/x86/entry/entry_64.S:312)
[ 0.243837] </TASK>
[ 0.243837] Modules linked in:
[ 0.243841] ---[ end trace 0000000000000000 ]---
[ 0.244395] RIP: 0010:vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1))
[ 0.244840] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd ff ff 84 c0 0f 85 b2 fd ff ff <0f> 0b 48 81 e1 00 00 00 c0 e9 ea fe ff ff 0f 0b e9 ab fd ff ff 48
All code
========
0: 31 d0 xor %edx,%eax
2: 48 23 05 5e d0 5e 01 and 0x15ed05e(%rip),%rax # 0x15ed067
9: 48 c1 e8 0c shr $0xc,%rax
d: 48 c1 e0 06 shl $0x6,%rax
11: 48 03 05 1f a5 5d 01 add 0x15da51f(%rip),%rax # 0x15da537
18: e9 cf fd ff ff jmp 0xfffffffffffffdec
1d: e8 4d dd ff ff call 0xffffffffffffdd6f
22: 84 c0 test %al,%al
24: 0f 85 b2 fd ff ff jne 0xfffffffffffffddc
2a:* 0f 0b ud2 <-- trapping instruction
2c: 48 81 e1 00 00 00 c0 and $0xffffffffc0000000,%rcx
33: e9 ea fe ff ff jmp 0xffffffffffffff22
38: 0f 0b ud2
3a: e9 ab fd ff ff jmp 0xfffffffffffffdea
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 48 81 e1 00 00 00 c0 and $0xffffffffc0000000,%rcx
9: e9 ea fe ff ff jmp 0xfffffffffffffef8
e: 0f 0b ud2
10: e9 ab fd ff ff jmp 0xfffffffffffffdc0
15: 48 rex.W
[ 0.245840] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[ 0.246349] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[ 0.246839] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[ 0.247516] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[ 0.247839] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[ 0.248522] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[ 0.248840] FS: 0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[ 0.249611] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.249839] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[ 0.250522] PKRU: 55555554
[ 0.250792] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 0.250837] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
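For anyone cross-checking the decoded output by hand, the following small sketch finds the byte that the oops marks with <..> in the "Code:" line and reports its offset. The function name is made up for illustration, and the byte string is copied from the decoded log above; this is not how the kernel's decode script is implemented, only a way to confirm that the marker corresponds to offset 0x2a, the ud2 flagged as the trapping instruction.

```python
def trap_offset(code_bytes):
    """Return the index of the <xx>-marked byte in a dmesg 'Code:' byte string.

    Returns None if no byte is marked.
    """
    for index, token in enumerate(code_bytes.split()):
        if token.startswith("<") and token.endswith(">"):
            return index
    return None


# Byte string copied from the "Code:" line of the decoded dmesg.
CODE = (
    "31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 "
    "e9 cf fd ff ff e8 4d dd ff ff 84 c0 0f 85 b2 fd ff ff <0f> 0b 48 81 e1 "
    "00 00 00 c0 e9 ea fe ff ff 0f 0b e9 ab fd ff ff 48"
)

if __name__ == "__main__":
    offset = trap_offset(CODE)
    print("trapping byte offset:", hex(offset))
```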
==== git bisect log ====
$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [df964ce9ef9fea10cf131bf6bad8658fde7956f6] Add linux-next specific files for 20230929
git bisect bad df964ce9ef9fea10cf131bf6bad8658fde7956f6
# status: waiting for good commit(s), bad commit known
# bad: [9ed22ae6be817d7a3f5c15ca22cbc9d3963b481d] Merge tag 'spi-fix-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/ki
git bisect bad 9ed22ae6be817d7a3f5c15ca22cbc9d3963b481d
# status: waiting for good commit(s), bad commit known
# bad: [6465e260f48790807eef06b583b38ca9789b6072] Linux 6.6-rc3
git bisect bad 6465e260f48790807eef06b583b38ca9789b6072
# status: waiting for good commit(s), bad commit known
# good: [0bb80ecc33a8fb5a682236443c1e740d5c917d1d] Linux 6.6-rc1
git bisect good 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
# good: [ce9ecca0238b140b88f43859b211c9fdfd8e5b70] Linux 6.6-rc2
git bisect good ce9ecca0238b140b88f43859b211c9fdfd8e5b70
# good: [27bbf45eae9ca98877a2d52a92a188147cd61b07] Merge tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernet
git bisect good 27bbf45eae9ca98877a2d52a92a188147cd61b07
# bad: [3abc79dce60e91f2aeec8abf1d09b250722fbeb5] Merge tag 'xfs-6.6-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xx
git bisect bad 3abc79dce60e91f2aeec8abf1d09b250722fbeb5
# good: [e583bffeb8bc3a7b64455b14376afd5fad71d62f] Merge tag 'x86-urgent-2023-09-22' of git://git.kernel.org/pub/scm/lp
git bisect good e583bffeb8bc3a7b64455b14376afd5fad71d62f
# good: [6ebb6500e54631b7013f4efe7d78ff562e437c5e] Merge tag 'fix-larp-requirements-6.6_2023-09-12' of https://git.kerA
git bisect good 6ebb6500e54631b7013f4efe7d78ff562e437c5e
# bad: [36fcf38152d8f163850831d52199adea4d6d9518] Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernelx
git bisect bad 36fcf38152d8f163850831d52199adea4d6d9518
# good: [5ad361f42fe43e5f13f9b88341e75eaf2d1bd183] arm64/hbc: Document HWCAP2_HBC
git bisect good 5ad361f42fe43e5f13f9b88341e75eaf2d1bd183
# bad: [aee9d30b9744d677509ef790f30f3a24c7841c3d] x86,static_call: Fix static-call vs return-thunk
git bisect bad aee9d30b9744d677509ef790f30f3a24c7841c3d
# good: [4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124] x86/alternatives: Remove faulty optimization
git bisect good 4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124
# first bad commit: [aee9d30b9744d677509ef790f30f3a24c7841c3d] x86,static_call: Fix static-call vs return-thunk
View attachment ".config" of type "text/plain" (202108 bytes)