Message-ID: <68d86871-12f2-4de1-81aa-dbc9e12b6f91@amd.com>
Date: Mon, 12 Jan 2026 20:47:50 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Marek Marczykowski-Górecki
<marmarek@...isiblethingslab.com>, Yazen Ghannam <yazen.ghannam@....com>
Cc: "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
"open list:AMD NODE DRIVER" <linux-kernel@...r.kernel.org>,
regressions@...ts.linux.dev
Subject: Re: kernel NULL pointer dereference in
quirk_clear_strap_no_soft_reset_dev2_f0 -> amd_smn_read
On 1/12/2026 7:01 PM, Marek Marczykowski-Górecki wrote:
> Hi,
>
> I've got a report that kernel 6.17.9 crashes when running a Xen HVM domU
> with AMD Raphael/Granite Ridge USB controller passed through.
> It worked correctly in 6.12.59. Between those versions, I don't see any
> relevant change to the quirk_clear_strap_no_soft_reset_dev2_f0() function,
> but the AMD node driver did get some changes, so my guess is one of them
> is to blame. I know the good-bad range is huge, but there aren't that
> many changes to the AMD node driver in this range.
Is this perhaps a case where only the USB controller was passed through,
but the root controller wasn't? In that case amd_smn_init() would never
have been called, so amd_roots would never have been initialized.
Indexing amd_roots in __amd_smn_rw() would then be a NULL pointer deref.
If that's correct, something like this should work to avoid it.
diff --git a/arch/x86/kernel/amd_node.c b/arch/x86/kernel/amd_node.c
index 3d0a4768d603c..894823b444d47 100644
--- a/arch/x86/kernel/amd_node.c
+++ b/arch/x86/kernel/amd_node.c
@@ -91,6 +91,11 @@ static int __amd_smn_rw(u8 i_off, u8 d_off, u16 node, u32 address, u32 *value, b
 	if (node >= amd_num_nodes())
 		return err;
 
+	if (!amd_roots) {
+		pr_warn("AMD SMN roots not initialized.\n");
+		return err;
+	}
+
 	root = amd_roots[node];
 	if (!root)
 		return err;
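
For illustration only, here is a small user-space model of what the guard
above is meant to do. It is a sketch, not the kernel code: fake_root, roots,
num_nodes, init_roots() and smn_read_sketch() are made-up names, and it
assumes the node count is known even when the root table was never set up.
The oops quoted below shows CR2 = 0 and RAX = 0 at the faulting instruction,
which looks consistent with a read through a NULL table pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's state; not the real definitions. */
struct fake_root {
	uint32_t reg;
};

/* Stays NULL until init_roots() runs, modelling an uninitialized amd_roots. */
static struct fake_root **roots;

/* Assumption: the node count is available independently of the root table. */
static unsigned int num_nodes = 1;

/* Guarded lookup: bounds check, then table check, then entry check. */
static int smn_read_sketch(unsigned int node, uint32_t *value)
{
	struct fake_root *root;

	if (node >= num_nodes)
		return -ENODEV;

	/* The proposed guard: the table itself was never allocated. */
	if (!roots)
		return -ENODEV;

	/* Without the guard above, this line would read through NULL. */
	root = roots[node];
	if (!root)
		return -ENODEV;

	*value = root->reg;
	return 0;
}

/* Models a successful init step that publishes the root table. */
static void init_roots(struct fake_root **table)
{
	roots = table;
}

/* Walks through the three cases: no table, valid node, out-of-range node. */
static void smn_sketch_selfcheck(void)
{
	static struct fake_root r0 = { .reg = 0x1234 };
	static struct fake_root *table[1] = { &r0 };
	uint32_t v = 0;

	roots = NULL;
	assert(smn_read_sketch(0, &v) == -ENODEV);

	init_roots(table);
	assert(smn_read_sketch(0, &v) == 0 && v == 0x1234);
	assert(smn_read_sketch(1, &v) == -ENODEV);
}
```

Calling smn_sketch_selfcheck() from a test program exercises all three
paths. The point is only that the caller gets an error back instead of a
fault when the init step never ran.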
>
> It's running on Qubes OS 4.3, which uses Xen 4.19, and does PCI
> passthrough of USB controllers to a dedicated VM (HVM).
>
> The full crash message is:
>
> [ 0.302571] pci 0000:00:08.0: quirk_usb_early_handoff+0x0/0x180 took 16590 usecs
> [ 0.303172] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 0.303189] #PF: supervisor read access in kernel mode
> [ 0.303202] #PF: error_code(0x0000) - not-present page
> [ 0.303216] PGD 0 P4D 0
> [ 0.303225] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 0.303236] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.9-1.qubes.fc41.x86_64 #1 PREEMPT(full)
> [ 0.303258] Hardware name: Xen HVM domU, BIOS 4.19.3 08/26/2025
> [ 0.303273] RIP: 0010:__amd_smn_rw+0x30/0x100
> [ 0.303288] Code: 05 bd 44 b8 01 66 0f af 05 2d 44 b8 01 41 57 41 56 41 55 41 54 55 53 66 39 c2 0f 83 c0 00 00 00 48 8b 05 c3 61 d7 02 0f b7 d2 <4c> 8b 34 d0 4d 85 f6 0f 84 a9 00 00 00 80 3d a4 61 d7 02 00 0f 84
> [ 0.303327] RSP: 0018:ffffcdd30001fd68 EFLAGS: 00010297
> [ 0.303341] RAX: 0000000000000000 RBX: ffffcdd30001fdb4 RCX: 0000000010136008
> [ 0.303359] RDX: 0000000000000000 RSI: 0000000000000064 RDI: 0000000000000060
> [ 0.303377] RBP: ffffffffa684bb80 R08: ffffcdd30001fdb4 R09: 0000000000000000
> [ 0.303395] R10: ffffffffa7567420 R11: 0000000000000020 R12: ffff8dd081dff000
> [ 0.303413] R13: ffffffffa736ab60 R14: 00000000055ee14a R15: ffff8dd081dff000
> [ 0.303434] FS: 0000000000000000(0000) GS:ffff8dd0e87c1000(0000) knlGS:0000000000000000
> [ 0.303452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.303468] CR2: 0000000000000000 CR3: 000000000c62c000 CR4: 0000000000750ef0
> [ 0.303487] PKRU: 55555554
> [ 0.303495] Call Trace:
> [ 0.303504] <TASK>
> [ 0.303513] ? __pfx_quirk_clear_strap_no_soft_reset_dev2_f0+0x10/0x10
> [ 0.304112] amd_smn_read+0x27/0x50
> [ 0.304112] quirk_clear_strap_no_soft_reset_dev2_f0+0x37/0x80
> [ 0.304112] pci_fixup_device+0xf6/0x1b0
> [ 0.304112] pci_apply_final_quirks+0xe9/0x280
> [ 0.304112] ? __pfx_pci_apply_final_quirks+0x10/0x10
> [ 0.304112] do_one_initcall+0x57/0x310
> [ 0.304112] do_initcalls+0x1ef/0x240
> [ 0.304112] kernel_init_freeable+0x187/0x210
> [ 0.304112] ? __pfx_kernel_init+0x10/0x10
> [ 0.304112] kernel_init+0x1a/0x140
> [ 0.304112] ret_from_fork+0xf2/0x110
> [ 0.304112] ? __pfx_kernel_init+0x10/0x10
> [ 0.304112] ret_from_fork_asm+0x1a/0x30
> [ 0.304112] </TASK>
> [ 0.304112] Modules linked in:
> [ 0.304112] CR2: 0000000000000000
> [ 0.304112] ---[ end trace 0000000000000000 ]---
> [ 0.304112] RIP: 0010:__amd_smn_rw+0x30/0x100
> [ 0.304112] Code: 05 bd 44 b8 01 66 0f af 05 2d 44 b8 01 41 57 41 56 41 55 41 54 55 53 66 39 c2 0f 83 c0 00 00 00 48 8b 05 c3 61 d7 02 0f b7 d2 <4c> 8b 34 d0 4d 85 f6 0f 84 a9 00 00 00 80 3d a4 61 d7 02 00 0f 84
> [ 0.304112] RSP: 0018:ffffcdd30001fd68 EFLAGS: 00010297
> [ 0.304112] RAX: 0000000000000000 RBX: ffffcdd30001fdb4 RCX: 0000000010136008
> [ 0.304112] RDX: 0000000000000000 RSI: 0000000000000064 RDI: 0000000000000060
> [ 0.304112] RBP: ffffffffa684bb80 R08: ffffcdd30001fdb4 R09: 0000000000000000
> [ 0.304112] R10: ffffffffa7567420 R11: 0000000000000020 R12: ffff8dd081dff000
> [ 0.304112] R13: ffffffffa736ab60 R14: 00000000055ee14a R15: ffff8dd081dff000
> [ 0.304112] FS: 0000000000000000(0000) GS:ffff8dd0e87c1000(0000) knlGS:0000000000000000
> [ 0.304112] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.304112] CR2: 0000000000000000 CR3: 000000000c62c000 CR4: 0000000000750ef0
> [ 0.304112] PKRU: 55555554
> [ 0.304112] Kernel panic - not syncing: Fatal exception
>
> The device, as seen from within the VM:
>
> 00:09.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raphael/Granite Ridge USB 2.0 xHCI [1022:15b8] (prog-if 30 [XHCI])
> Subsystem: ASUSTeK Computer Inc. Device [1043:8877]
> Physical Slot: 9
> Flags: bus master, fast devsel, latency 0, IRQ 21
> Memory at f2200000 (64-bit, non-prefetchable) [size=1M]
> Capabilities: [48] Vendor Specific Information: Len=08 <?>
> Capabilities: [50] Power Management version 3
> Capabilities: [64] Express Endpoint, IntMsgNum 0
> Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
> Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
> Kernel driver in use: xhci_hcd
> Kernel modules: xhci_pci
> 00: 22 10 b8 15 07 04 10 00 00 30 03 0c 10 00 00 00
> 10: 04 00 20 f2 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 77 88
> 30: 00 00 00 00 48 00 00 00 00 00 00 00 2e 01 00 00
> 40: 00 00 00 00 00 00 00 00 09 50 08 00 43 10 77 88
> 50: 01 64 03 00 08 00 00 00 00 00 00 00 00 00 00 00
> 60: 31 60 00 00 10 a0 02 00 a1 8f 00 00 30 29 00 00
> 70: 04 0d 40 00 00 00 04 11 00 00 00 00 00 00 00 00
> 80: 00 00 00 00 00 00 00 00 1f 00 01 00 00 00 00 00
> 90: 1e 00 80 01 04 00 1f 00 00 00 00 00 00 00 00 00
> a0: 05 c0 80 00 00 00 00 00 00 00 00 00 00 00 00 00
> b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> c0: 11 00 07 80 00 e0 0f 00 00 f0 0f 00 00 00 00 00
> d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> Any ideas?
>
> Original report (with full kernel log etc.): https://forum.qubes-os.org/t/yet-another-usb-keyboard-thread/38355/8
>
> #regzbot introduced: v6.12.59..v6.17.9
>