Message-ID: <aNPxLQBxUau-FWtj@google.com>
Date: Wed, 24 Sep 2025 06:25:01 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Srikanth Aithal <sraithal@....com>
Cc: Linux-Next Mailing List <linux-next@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>,
KVM <kvm@...r.kernel.org>, Ashish Kalra <Ashish.Kalra@....com>,
Ard Biesheuvel <ardb@...nel.org>, Borislav Petkov <bp@...en8.de>, Tom Lendacky <thomas.lendacky@....com>
Subject: Re: AMD SNP guest kdump broken since linuxnext-20250908
+Ard and Boris (and Tom for good measure)
On Wed, Sep 24, 2025, Srikanth Aithal wrote:
> Hello all,
>
> kdump on an SNP guest is broken in linux-next, starting with next-20250908 [1].
>
> kdump on an SNP guest works with the following kernels as the guest kernel:
>
> 1. https://git.kernel.org/pub/scm/virt/kvm/kvm.git, kvm/next
> 2. git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git next-20250905
> 3. git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git v6.17-rc7
>
> The crash log during kdump varies each time. I have attached all variants of
> the error console logs to this bug report as files, as they are too large to
> include here.
>
> kdump with other guest types (normal, SEV, SEV-ES) is working fine.
>
> I attempted bisecting multiple times, but have had no success so far because
> the console output varies: sometimes a call trace, sometimes just a hang with
> no error messages, and sometimes extensive register dumps including KVM
> hardware error messages. Additionally, a couple of linux-next bisect attempts
> pointed to a merge commit whose parent commits had no issues, suggesting a
> possible merge problem.
>
> I am also attaching the host kernel config and guest kernel config used for
> these tests.
>
> Tests were conducted with the following component versions:
>
> * Host kernel: next-20250919
> * QEMU version: v10.1.0
> * EDK2: edk2-stable202508
> * Platform: Milan with the latest BIOS v2.20
>
>
> Thank you,
>
> Srikanth Aithal <Srikanth.Aithal@....com>
>
> root@...ntu:~# echo c > /proc/sysrq-trigger
> [ 26.686014] sysrq: Trigger a crash
> [ 26.687006] Kernel panic - not syncing: sysrq triggered crash
> [ 26.688594] CPU: 0 UID: 0 PID: 4235 Comm: bash Kdump: loaded Not tainted 6.17.0-rc7-next-20250923ce7f1a983b07 #1 PREEMPT(voluntary)
> [ 26.691788] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
> [ 26.693957] Call Trace:
> [ 26.694681] <TASK>
> [ 26.695320] vpanic+0x307/0x360
> [ 26.696237] panic+0x52/0x60
> [ 26.697065] sysrq_handle_crash+0x11/0x20
> [ 26.698177] __handle_sysrq+0xb6/0x170
> [ 26.699220] write_sysrq_trigger+0x50/0x70
> [ 26.700358] proc_reg_write+0x50/0x90
> [ 26.701395] ? preempt_count_add+0x42/0xa0
> [ 26.702531] vfs_write+0xf4/0x430
> [ 26.703481] ? handle_mm_fault+0xd0/0x200
> [ 26.704602] ksys_write+0x5c/0xd0
> [ 26.705551] do_syscall_64+0x4c/0x200
> [ 26.706577] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 26.707961] RIP: 0033:0x7f4cb8024574
> [ 26.708974] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d d5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
> [ 26.713912] RSP: 002b:00007ffdad4f3208 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> [ 26.715976] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f4cb8024574
> [ 26.717905] RDX: 0000000000000002 RSI: 0000564731e37b80 RDI: 0000000000000001
> [ 26.719843] RBP: 00007ffdad4f3230 R08: 0000000000000073 R09: 0000000000000000
> [ 26.721797] R10: 00000000ffffffff R11: 0000000000000202 R12: 0000000000000002
> [ 26.723715] R13: 0000564731e37b80 R14: 00007f4cb810c5c0 R15: 00007f4cb8109ee0
> [ 26.725658] </TASK>
>
> [1373710140.379273] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
> [2800084354.542901] BUG: unable to handle page fault for address: ffffffff9a91e731
> [15541331571.597940] #PF: supervisor instruction fetch in kernel mode
> [11262208929.107056] #PF: error_code(0x0011) - permissions violation
> [15541331571.597940] PGD 800000e045067 P4D 800000e045067 PUD 800000e046063 PMD 80000021b8063 PTE 800800000e91e163
This is definitely a valid (i.e. not corrupted), NX mapping.
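For anyone who wants to double-check that reading, here is a minimal sketch that
decodes the leaf PTE and the #PF error code from the log. The flag bit positions
are the architectural x86-64 ones; treating bit 51 as the SEV C-bit is an
assumption inferred from bit 51 being set in both the PTE and CR3 above, and the
helper names are made up for illustration.

```python
# Sketch: decode an x86-64 leaf PTE and a #PF error code.
# ASSUMPTION: the SEV C-bit is bit 51 (inferred from the PTE and CR3 in the log).
C_BIT = 51

# Architectural flag bits of a 4K leaf PTE.
PTE_FLAGS = {0: "P", 1: "RW", 2: "US", 3: "PWT", 4: "PCD",
             5: "A", 6: "D", 7: "PAT", 8: "G", 63: "NX"}

# Low bits of the page-fault error code.
PF_ERROR_BITS = {0: "P", 1: "W", 2: "U", 3: "RSVD", 4: "I"}


def decode_pte(pte):
    """Return (set flag names, C-bit set?, physical page address)."""
    flags = [name for bit, name in sorted(PTE_FLAGS.items()) if (pte >> bit) & 1]
    encrypted = bool((pte >> C_BIT) & 1)
    # Keep bits 12..50: drop the flag bits below and the C-bit/NX above.
    phys = pte & ((1 << C_BIT) - 1) & ~0xFFF
    return flags, encrypted, phys


def decode_pf_error(code):
    """Return the names of the bits set in a #PF error code."""
    return [name for bit, name in sorted(PF_ERROR_BITS.items()) if (code >> bit) & 1]


if __name__ == "__main__":
    flags, encrypted, phys = decode_pte(0x800800000E91E163)
    print("PTE flags:", flags, "C-bit:", encrypted, "phys:", hex(phys))
    print("#PF error bits:", decode_pf_error(0x0011))
```

Under that assumption the PTE decodes to present, writable, accessed, dirty,
global and NX, with the C-bit set and a sane physical address (0xe91e000), and
the error code decodes to P (protection violation on a present page) plus I
(instruction fetch), which matches the "supervisor instruction fetch" and
"permissions violation" lines.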
> [1373710140.379273] Oops: Oops: 0011 [#1] SMP NOPTI
> [11262208929.107056] CPU: 0 UID: 0 PID: 4235 Comm: bash Kdump: loaded Not tainted 6.17.0-rc7-next-20250923ce7f1a983b07 #1 PREEMPT(voluntary)
> [2800084354.542901] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
> [12688583143.270684] RIP: 0010:early_set_pages_state+0x0/0x120
Given that a lore search on early_set_pages_state lights up Ard's series[*] to
clean up the boot code for SEV, and that said series is new in next-20250908
(NOT in next-20250905), that seems like a likely culprit.
[*] https://lore.kernel.org/all/20250828102202.1849035-24-ardb+git@google.com
> [15541331571.597940] Code: 02 02 02 02 02 00 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 02 02 02 00 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 <02> 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 02 02 02 02 02 02
> [12688583143.270684] RSP: 0018:ffffb608807a7be0 EFLAGS: 00010006
> [1373710140.379273] RAX: ffff9ed0bfe53000 RBX: ffffffff9abecbe8 RCX: ffffb608807a7be8
> [2800084354.542901] RDX: 0000000000000001 RSI: 000000007fe53000 RDI: ffff9ed03fe53000
> [1373710140.379273] RBP: 0000000000000001 R08: 0000000000000001 R09: ffff9ed03fe53000
> [12688583143.270684] R10: 000000000f001000 R11: 0000000000000000 R12: ffff9ed03fe53000
> [15541331571.597940] R13: 0000000000000000 R14: ffff9ecfcf00a298 R15: 0000000000001000
> [11262208929.107056] FS: 00007f4cb7f05740(0000) GS:ffff9ed0a282c000(0000) knlGS:0000000000000000
> [2800084354.542901] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [18394079999.925196] CR2: ffffffff9a91e731 CR3: 000800000fb1c000 CR4: 00000000003506f0
> [12688583143.270684] Call Trace:
> [18394079999.925196] <TASK>
> [2800084354.542901] set_pages_state.part.0+0x63/0xa0
> [2800084354.542901] snp_kexec_finish+0x432/0x490
> [12688583143.270684] native_machine_crash_shutdown+0x65/0x90
> [15541331571.597940] __crash_kexec+0x56/0x120
> [1373710140.379273] ? __crash_kexec+0x104/0x120
> [12688583143.270684] ? vpanic+0x2a2/0x360
> [18394079999.925196] ? panic+0x52/0x60
> [11262208929.107056] ? sysrq_handle_crash+0x11/0x20
> [16967705785.761568] ? __handle_sysrq+0xb6/0x170
> [1373710140.379273] ? write_sysrq_trigger+0x50/0x70
> [1373710140.379273] ? proc_reg_write+0x50/0x90
> [18394079999.925196] ? preempt_count_add+0x42/0xa0
> [2800084354.542901] ? vfs_write+0xf4/0x430
> [11262208929.107056] ? handle_mm_fault+0xd0/0x200
> [18394079999.925196] ? ksys_write+0x5c/0xd0
> [12688583143.270684] ? do_syscall_64+0x4c/0x200
> [11262208929.107056] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [15541331571.597940] </TASK>
> [12688583143.270684] Modules linked in: efivarfs
> [2800084354.542901] CR2: ffffffff9a91e731
> [14114957357.434312] ---[ end trace 0000000000000000 ]---
> [11262208929.107056] RIP: 0010:early_set_pages_state+0x0/0x120
> [12688583143.270684] Code: 02 02 02 02 02 00 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 02 02 02 00 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 <02> 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 02 02 02 02 02 02
> [15541331571.597940] RSP: 0018:ffffb608807a7be0 EFLAGS: 00010006
> [14114957357.434312] RAX: ffff9ed0bfe53000 RBX: ffffffff9abecbe8 RCX: ffffb608807a7be8
> [2800084354.542901] RDX: 0000000000000001 RSI: 000000007fe53000 RDI: ffff9ed03fe53000
> [15541331571.597940] RBP: 0000000000000001 R08: 0000000000000001 R09: ffff9ed03fe53000
> [2800084354.542901] R10: 000000000f001000 R11: 0000000000000000 R12: ffff9ed03fe53000
> [2800084354.542901] R13: 0000000000000000 R14: ffff9ecfcf00a298 R15: 0000000000001000
> [2800084354.542901] FS: 00007f4cb7f05740(0000) GS:ffff9ed0a282c000(0000) knlGS:0000000000000000
> [14114957357.434312] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [11262208929.107056] CR2: ffffffff9a91e731 CR3: 000800000fb1c000 CR4: 00000000003506f0
> [12688583143.270684] Kernel panic - not syncing: Fatal exception