Date:   Sun, 1 Oct 2023 01:26:03 +0900
From:   Hyeonggon Yoo <42.hyeyoo@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        David Kaplan <David.Kaplan@....com>,
        Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>,
        x86@...nel.org, 42.hyeyoo@...il.com
Subject: Re: Linux 6.6-rc3 (DEBUG_VIRTUAL is unhappy on x86)

On Sun, Sep 24, 2023 at 02:36:21PM -0700, Linus Torvalds wrote:
> Another week, another -rc.
> 
> As usual, rc3 is a bit larger than rc2, as people have started finding
> more issues.
> 
> Unusually, we have a large chunk of changes in filesystems. Part of it
> is the vfs-level revert of some of the timestamp handling that needs
> to soak a bit more, and part of it is some xfs fixes. With a few other
> filesystem fixes too.
> 
> But drivers and architecture updates are also up there, so it's not
> like the fs stuff dominates. It's just more noticeable than it usually
> is.
> 
> Anyway, please do go test. None of this looks scary,
> 
>                  Linus
> 
> ---

[...]

> Peter Zijlstra (1):
>       x86,static_call: Fix static-call vs return-thunk

Hello, the commit above causes a crash on an x86 kernel built with
CONFIG_DEBUG_VIRTUAL=y.

The compiler version is gcc (GCC) 13.2.1 20230728 (Red Hat 13.2.1-1).
Below are the dmesg output (raw and decoded), the git bisect log,
and the configuration used.

I'm not sure whether this would lead to an unwelcome surprise, because
vmalloc_to_page() on any valid kernel address should work anyway.
But it seems that, for some reason, the kernel confuses the kernel text
area with the vmalloc/module area while patching kernel code.

Should be an x86-specific issue.
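
To illustrate the confusion described above, here is a minimal Python
sketch that classifies addresses from the register dump against the
x86-64 virtual memory layout. The range constants are an assumption:
they are the 4-level paging, non-KASLR values listed in
Documentation/arch/x86/x86_64/mm.rst, not something taken from the
attached configuration. The function names are hypothetical. The address
in RBX/RSI/RDI (ffffffff83ce1124) is presumably the one handed to
vmalloc_to_page().

```python
# Assumed layout: x86-64, 4-level paging, KASLR disabled
# (values from Documentation/arch/x86/x86_64/mm.rst).
VMALLOC_START, VMALLOC_END = 0xffffc90000000000, 0xffffe8ffffffffff
KERNEL_IMAGE_START, KERNEL_IMAGE_END = 0xffffffff80000000, 0xffffffff9fffffff
MODULES_START, MODULES_END = 0xffffffffa0000000, 0xfffffffffeffffff


def classify(addr):
    """Return the name of the assumed region that contains addr."""
    if VMALLOC_START <= addr <= VMALLOC_END:
        return "vmalloc"
    if KERNEL_IMAGE_START <= addr <= KERNEL_IMAGE_END:
        return "kernel image"
    if MODULES_START <= addr <= MODULES_END:
        return "modules"
    return "other"


def is_vmalloc_or_module(addr):
    """Rough userspace analogue of the range check a vmalloc lookup relies on."""
    return classify(addr) in ("vmalloc", "modules")


if __name__ == "__main__":
    # RBX and RSP as they appear in the register dump of this report.
    for name, addr in (("RBX", 0xffffffff83ce1124), ("RSP", 0xffffc90000013c68)):
        print(f"{name} {addr:#018x}: {classify(addr)}")
```

Under these assumptions RBX falls in the kernel image mapping rather
than the vmalloc or module ranges, which would be consistent with a
DEBUG_VIRTUAL range check firing inside vmalloc_to_page().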

==== dmesg (raw) ====

The log below is from a kernel built on top of commit aee9d30b9744.

[    0.242439] ------------[ cut here ]------------
[    0.242840] kernel BUG at mm/vmalloc.c:673!
[    0.243255] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[    0.243837] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.0-rc2+ #60
[    0.243837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[    0.243837] RIP: 0010:vmalloc_to_page+0x28b/0x390
[    0.243837] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd 8
[    0.243837] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[    0.243837] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[    0.243837] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[    0.243837] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[    0.243837] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[    0.243837] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[    0.243837] FS:  0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[    0.243837] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.243837] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[    0.243837] PKRU: 55555554
[    0.243837] Call Trace:
[    0.243837]  <TASK>
[    0.243837]  ? die+0x36/0x90
[    0.243837]  ? do_trap+0xda/0x100
[    0.243837]  ? vmalloc_to_page+0x28b/0x390
[    0.243837]  ? do_error_trap+0x6a/0x90
[    0.243837]  ? vmalloc_to_page+0x28b/0x390
[    0.243837]  ? exc_invalid_op+0x50/0x70
[    0.243837]  ? vmalloc_to_page+0x28b/0x390
[    0.243837]  ? asm_exc_invalid_op+0x1a/0x20
[    0.243837]  ? vmalloc_to_page+0x28b/0x390
[    0.243837]  ? vmalloc_to_page+0x283/0x390
[    0.243837]  __text_poke+0x2d8/0x510
[    0.243837]  ? __pfx_text_poke_memcpy+0x10/0x10
[    0.243837]  ? srso_alias_return_thunk+0x5/0x7f
[    0.243837]  ? text_poke_loc_init+0x78/0x1e0
[    0.243837]  text_poke_bp_batch+0x91/0x300
[    0.243837]  text_poke_bp+0x4f/0x70
[    0.243837]  __static_call_transform+0xc0/0x200
[    0.243837]  arch_static_call_transform+0x83/0xa0
[    0.243837]  __static_call_init+0x20e/0x280
[    0.243837]  ? __pfx_static_call_init+0x10/0x10
[    0.243837]  static_call_init+0x39/0xa0
[    0.243837]  ? __pfx_static_call_init+0x10/0x10
[    0.243837]  do_one_initcall+0x5d/0x320
[    0.243837]  kernel_init_freeable+0x231/0x470
[    0.243837]  ? __pfx_kernel_init+0x10/0x10
[    0.243837]  kernel_init+0x1a/0x1c0
[    0.243837]  ret_from_fork+0x34/0x50
[    0.243837]  ? __pfx_kernel_init+0x10/0x10
[    0.243837]  ret_from_fork_asm+0x1b/0x30
[    0.243837]  </TASK>
[    0.243837] Modules linked in:
[    0.243841] ---[ end trace 0000000000000000 ]---
[    0.244395] RIP: 0010:vmalloc_to_page+0x28b/0x390
[    0.244840] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd 8
[    0.245840] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[    0.246349] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[    0.246839] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[    0.247516] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[    0.247839] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[    0.248522] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[    0.248840] FS:  0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[    0.249611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.249839] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[    0.250522] PKRU: 55555554
[    0.250792] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.250837] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
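
In the call trace above, entries prefixed with "?" are ones the unwinder
could not verify, so the unprefixed entries form the reliable call chain.
A small Python sketch that extracts them, assuming the raw dmesg format
shown above (the name reliable_frames and the regular expression are
hypothetical, not part of any kernel tooling):

```python
import re

# Matches a call-trace line such as "[    0.243837]  ? die+0x36/0x90".
# Group 1 is the optional "? " marker, group 2 is the symbol name.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def reliable_frames(dmesg_text):
    """Return symbol names of trace entries that carry no '?' marker."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(1) is None:
            frames.append(m.group(2))
    return frames
```

Applied to the raw log, this would list __text_poke first, followed by
the text_poke_bp and static_call frames down to ret_from_fork_asm.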

==== dmesg (decoded) ====

[    0.242439] ------------[ cut here ]------------
[    0.242840] kernel BUG at mm/vmalloc.c:673!
[    0.243255] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[    0.243837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[    0.243837] RIP: 0010:vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[ 0.243837] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd ff ff 84 c0 0f 85 b2 fd ff ff <0f> 0b 48 81 e1 00 00 00 c0 e9 ea fe ff ff 0f 0b e9 ab fd ff ff 48
All code
========
   0:	31 d0                	xor    %edx,%eax
   2:	48 23 05 5e d0 5e 01 	and    0x15ed05e(%rip),%rax        # 0x15ed067
   9:	48 c1 e8 0c          	shr    $0xc,%rax
   d:	48 c1 e0 06          	shl    $0x6,%rax
  11:	48 03 05 1f a5 5d 01 	add    0x15da51f(%rip),%rax        # 0x15da537
  18:	e9 cf fd ff ff       	jmp    0xfffffffffffffdec
  1d:	e8 4d dd ff ff       	call   0xffffffffffffdd6f
  22:	84 c0                	test   %al,%al
  24:	0f 85 b2 fd ff ff    	jne    0xfffffffffffffddc
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	48 81 e1 00 00 00 c0 	and    $0xffffffffc0000000,%rcx
  33:	e9 ea fe ff ff       	jmp    0xffffffffffffff22
  38:	0f 0b                	ud2
  3a:	e9 ab fd ff ff       	jmp    0xfffffffffffffdea
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	48 81 e1 00 00 00 c0 	and    $0xffffffffc0000000,%rcx
   9:	e9 ea fe ff ff       	jmp    0xfffffffffffffef8
   e:	0f 0b                	ud2
  10:	e9 ab fd ff ff       	jmp    0xfffffffffffffdc0
  15:	48                   	rex.W
[    0.243837] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[    0.243837] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[    0.243837] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[    0.243837] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[    0.243837] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[    0.243837] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[    0.243837] FS:  0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[    0.243837] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.243837] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[    0.243837] PKRU: 55555554
[    0.243837] Call Trace:
[    0.243837]  <TASK>
[    0.243837] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447) 
[    0.243837] ? do_trap (arch/x86/kernel/traps.c:112 arch/x86/kernel/traps.c:153) 
[    0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[    0.243837] ? do_error_trap (./arch/x86/include/asm/traps.h:59 arch/x86/kernel/traps.c:174) 
[    0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[    0.243837] ? exc_invalid_op (arch/x86/kernel/traps.c:265) 
[    0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[    0.243837] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) 
[    0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[    0.243837] ? vmalloc_to_page (mm/vmalloc.c:673 (discriminator 2)) 
[    0.243837] __text_poke (arch/x86/kernel/alternative.c:1783) 
[    0.243837] ? __pfx_text_poke_memcpy (arch/x86/kernel/alternative.c:1753) 
[    0.243837] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:186) 
[    0.243837] ? text_poke_loc_init (arch/x86/kernel/alternative.c:2308 (discriminator 1)) 
[    0.243837] text_poke_bp_batch (arch/x86/kernel/alternative.c:2198 (discriminator 1)) 
[    0.243837] text_poke_bp (arch/x86/kernel/alternative.c:2431) 
[    0.243837] __static_call_transform (arch/x86/kernel/static_call.c:112) 
[    0.243837] arch_static_call_transform (arch/x86/kernel/static_call.c:172) 
[    0.243837] __static_call_init (kernel/static_call_inline.c:233 (discriminator 1)) 
[    0.243837] ? __pfx_static_call_init (kernel/static_call_inline.c:486) 
[    0.243837] static_call_init (kernel/static_call_inline.c:41 kernel/static_call_inline.c:497) 
[    0.243837] ? __pfx_static_call_init (kernel/static_call_inline.c:486) 
[    0.243837] do_one_initcall (init/main.c:1232) 
[    0.243837] kernel_init_freeable (init/main.c:1337 (discriminator 1) init/main.c:1537 (discriminator 1)) 
[    0.243837] ? __pfx_kernel_init (init/main.c:1429) 
[    0.243837] kernel_init (init/main.c:1439) 
[    0.243837] ret_from_fork (arch/x86/kernel/process.c:153) 
[    0.243837] ? __pfx_kernel_init (init/main.c:1429) 
[    0.243837] ret_from_fork_asm (arch/x86/entry/entry_64.S:312) 
[    0.243837]  </TASK>
[    0.243837] Modules linked in:
[    0.243841] ---[ end trace 0000000000000000 ]---
[    0.244395] RIP: 0010:vmalloc_to_page (mm/vmalloc.c:673 (discriminator 1)) 
[ 0.244840] Code: 31 d0 48 23 05 5e d0 5e 01 48 c1 e8 0c 48 c1 e0 06 48 03 05 1f a5 5d 01 e9 cf fd ff ff e8 4d dd ff ff 84 c0 0f 85 b2 fd ff ff <0f> 0b 48 81 e1 00 00 00 c0 e9 ea fe ff ff 0f 0b e9 ab fd ff ff 48
All code
========
   0:	31 d0                	xor    %edx,%eax
   2:	48 23 05 5e d0 5e 01 	and    0x15ed05e(%rip),%rax        # 0x15ed067
   9:	48 c1 e8 0c          	shr    $0xc,%rax
   d:	48 c1 e0 06          	shl    $0x6,%rax
  11:	48 03 05 1f a5 5d 01 	add    0x15da51f(%rip),%rax        # 0x15da537
  18:	e9 cf fd ff ff       	jmp    0xfffffffffffffdec
  1d:	e8 4d dd ff ff       	call   0xffffffffffffdd6f
  22:	84 c0                	test   %al,%al
  24:	0f 85 b2 fd ff ff    	jne    0xfffffffffffffddc
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	48 81 e1 00 00 00 c0 	and    $0xffffffffc0000000,%rcx
  33:	e9 ea fe ff ff       	jmp    0xffffffffffffff22
  38:	0f 0b                	ud2
  3a:	e9 ab fd ff ff       	jmp    0xfffffffffffffdea
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	48 81 e1 00 00 00 c0 	and    $0xffffffffc0000000,%rcx
   9:	e9 ea fe ff ff       	jmp    0xfffffffffffffef8
   e:	0f 0b                	ud2
  10:	e9 ab fd ff ff       	jmp    0xfffffffffffffdc0
  15:	48                   	rex.W
[    0.245840] RSP: 0018:ffffc90000013c68 EFLAGS: 00010246
[    0.246349] RAX: ffffe8ffffffff00 RBX: ffffffff83ce1124 RCX: 0000000000000027
[    0.246839] RDX: ffffc90000000000 RSI: ffffffff83ce1124 RDI: ffffffff83ce1124
[    0.247516] RBP: ffffffff83020ff8 R08: 000000000000000f R09: ffffffff83cff7e5
[    0.247839] R10: ffffffff83cff7e4 R11: ffffc90000013d6a R12: ffffc90000013d70
[    0.248522] R13: 0000000000000125 R14: 0000000000000000 R15: ffffffff8321fef0
[    0.248840] FS:  0000000000000000(0000) GS:ffff88813b400000(0000) knlGS:0000000000000000
[    0.249611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.249839] CR2: ffff88813f9ff000 CR3: 0000000003020000 CR4: 0000000000750ef0
[    0.250522] PKRU: 55555554
[    0.250792] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.250837] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

==== git bisect log ====

$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# bad: [df964ce9ef9fea10cf131bf6bad8658fde7956f6] Add linux-next specific files for 20230929
git bisect bad df964ce9ef9fea10cf131bf6bad8658fde7956f6
# status: waiting for good commit(s), bad commit known
# bad: [9ed22ae6be817d7a3f5c15ca22cbc9d3963b481d] Merge tag 'spi-fix-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/ki
git bisect bad 9ed22ae6be817d7a3f5c15ca22cbc9d3963b481d
# status: waiting for good commit(s), bad commit known
# bad: [6465e260f48790807eef06b583b38ca9789b6072] Linux 6.6-rc3
git bisect bad 6465e260f48790807eef06b583b38ca9789b6072
# status: waiting for good commit(s), bad commit known
# good: [0bb80ecc33a8fb5a682236443c1e740d5c917d1d] Linux 6.6-rc1
git bisect good 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
# good: [ce9ecca0238b140b88f43859b211c9fdfd8e5b70] Linux 6.6-rc2
git bisect good ce9ecca0238b140b88f43859b211c9fdfd8e5b70
# good: [27bbf45eae9ca98877a2d52a92a188147cd61b07] Merge tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernet
git bisect good 27bbf45eae9ca98877a2d52a92a188147cd61b07
# bad: [3abc79dce60e91f2aeec8abf1d09b250722fbeb5] Merge tag 'xfs-6.6-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xx
git bisect bad 3abc79dce60e91f2aeec8abf1d09b250722fbeb5
# good: [e583bffeb8bc3a7b64455b14376afd5fad71d62f] Merge tag 'x86-urgent-2023-09-22' of git://git.kernel.org/pub/scm/lp
git bisect good e583bffeb8bc3a7b64455b14376afd5fad71d62f
# good: [6ebb6500e54631b7013f4efe7d78ff562e437c5e] Merge tag 'fix-larp-requirements-6.6_2023-09-12' of https://git.kerA
git bisect good 6ebb6500e54631b7013f4efe7d78ff562e437c5e
# bad: [36fcf38152d8f163850831d52199adea4d6d9518] Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernelx
git bisect bad 36fcf38152d8f163850831d52199adea4d6d9518
# good: [5ad361f42fe43e5f13f9b88341e75eaf2d1bd183] arm64/hbc: Document HWCAP2_HBC
git bisect good 5ad361f42fe43e5f13f9b88341e75eaf2d1bd183
# bad: [aee9d30b9744d677509ef790f30f3a24c7841c3d] x86,static_call: Fix static-call vs return-thunk
git bisect bad aee9d30b9744d677509ef790f30f3a24c7841c3d
# good: [4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124] x86/alternatives: Remove faulty optimization
git bisect good 4ba89dd6ddeca2a733bdaed7c9a5cbe4e19d9124
# first bad commit: [aee9d30b9744d677509ef790f30f3a24c7841c3d] x86,static_call: Fix static-call vs return-thunk


View attachment ".config" of type "text/plain" (202108 bytes)
