Message-ID: <b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net>
Date: Tue, 3 Feb 2026 17:27:00 -0500 (EST)
From: Cal Peake <cp@...olutedigital.net>
To: Kernel Mailing List <linux-kernel@...r.kernel.org>
cc: Mario Limonciello <superm1@...nel.org>,
Kent Russell <kent.russell@....com>,
Alex Deucher <alexander.deucher@....com>
Subject: amdgpu driver rebinding broken by "drm/amd: Clean up kfd node on
surprise disconnect"
Hi,
The recent commit 28695ca09d32 ("drm/amd: Clean up kfd node on surprise
disconnect") has broken something with my VMs utilizing PCI passthrough.
Before launching the VM, I unbind a secondary Radeon GPU from the amdgpu
driver and then bind it to the vfio-pci driver.
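For readers who want to reproduce this, a rebind of this kind can be sketched with the standard sysfs PCI interface. This is only an illustration under stated assumptions, not necessarily the exact procedure used here: the helper names `sysfs_write` and `rebind` and the `sysfs_root` parameter are hypothetical, the vfio-pci module is assumed to be loaded already, and the writes require root.

```python
import os


def sysfs_write(path, value):
    # sysfs attributes are set by writing a short string to the file.
    with open(path, "w") as f:
        f.write(value)


def rebind(bdf, new_driver, sysfs_root="/sys"):
    """Move one PCI function (e.g. "0000:14:00.0") to new_driver.

    sysfs_root is a hypothetical parameter so the logic can be
    exercised against a scratch directory instead of the real /sys.
    """
    dev = os.path.join(sysfs_root, "bus/pci/devices", bdf)

    # Detach the function from its current driver, if it has one.
    unbind = os.path.join(dev, "driver", "unbind")
    if os.path.exists(unbind):
        sysfs_write(unbind, bdf)

    # Restrict the next probe to the requested driver, then ask the
    # PCI core to probe the function again.
    sysfs_write(os.path.join(dev, "driver_override"), new_driver)
    sysfs_write(os.path.join(sysfs_root, "bus/pci/drivers_probe"), bdf)
```

Under these assumptions, calling `rebind("0000:14:00.0", "vfio-pci")` as root, and likewise for the other functions of the card, would perform the unbind and bind steps described above.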
Previously, everything would just work and I'd get the following kernel
output:
amdgpu 0000:14:00.0: amdgpu: amdgpu: finishing device.
[drm] amdgpu: ttm finalized
vfio-pci 0000:14:00.0: vgaarb: VGA decodes changed: olddecodes=none,decodes=io+mem:owns=none
vfio-pci 0000:14:00.1: enabling device (0000 -> 0002)
vfio-pci 0000:14:00.3: enabling device (0000 -> 0002)
However, now when doing the rebinding, the host display (a different
Radeon GPU) freezes and, after a short time, the system resets
(thanks to a watchdog, I believe). I then get this in the kernel log:
amdgpu 0000:14:00.0: amdgpu: amdgpu: finishing device.
vfio-pci 0000:14:00.1: Unable to change power state from D3hot to D0, device inaccessible
vfio-pci 0000:14:00.3: Unable to change power state from D3hot to D0, device inaccessible
vfio-pci 0000:14:00.2: Unable to change power state from D3hot to D0, device inaccessible
Indeed, backing out the commit from kernels 6.12.67, 6.12.68, and 6.18.8
gets things back to working.
Please let me know if you have any ideas or if there is any more
debugging info I can provide.
Thanks,
--
Cal Peake