Message-ID: <5f4dfc03-bdfc-41d1-8c5a-1e767e472a96@crc.id.au>
Date: Fri, 29 Dec 2023 16:46:50 +1100
From: Steven Haigh <netwiz@....id.au>
To: Lukas Wunner <lukas@...ner.de>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
f.ebner@...xmox.com
Subject: Re: Qemu KVM thread spins at 100% CPU usage on scsi hot-unplug
(kernel 6.6.8 guest)
On 29/12/23 00:18, Lukas Wunner wrote:
> On Thu, Dec 28, 2023 at 01:03:10PM +1100, Steven Haigh wrote:
>> At some point in kernel 6.6.x, SCSI hotplug in qemu VMs broke. This was
>> mostly fixed in the following commit to release 6.6.8:
>> commit 5cc8d88a1b94b900fd74abda744c29ff5845430b
>> Author: Bjorn Helgaas <bhelgaas@...gle.com>
>> Date: Thu Dec 14 09:08:56 2023 -0600
>> Revert "PCI: acpiphp: Reassign resources on bridge if necessary"
>>
>> After this commit, the SCSI block device is hotplugged correctly, and a device node as /dev/sdX appears within the qemu VM.
>>
>> New problem:
>>
>> When the same SCSI block device is hot-unplugged, the QEMU KVM process will
>> spin at 100% CPU usage. The guest shows no CPU being used via top, but the
>> host will continue to spin in the KVM thread until the VM is rebooted.
>
> Find out the PID of the qemu process on the host, then cat /proc/$PID/stack
> to see where the CPU time is spent.
Thanks for the tip - I'll certainly do that.
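For reference, here is a minimal sketch of how the stacks could be collected on the host the next time this happens. It assumes that the spinning thread may not be the main qemu thread, so it reads /proc/$PID/task/$TID/stack for every thread rather than only /proc/$PID/stack. The function names and the proc_root parameter are my own for illustration, and reading the stack files normally requires root.

```python
import os


def collect_thread_stacks(pid, proc_root="/proc"):
    """Return {tid: (thread_name, kernel_stack_text)} for every thread of pid.

    proc_root is a hypothetical parameter so the function can be pointed
    at a copy of the directory layout; on a real host leave it as /proc.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    stacks = {}
    for tid in sorted(os.listdir(task_dir), key=int):
        base = os.path.join(task_dir, tid)
        try:
            with open(os.path.join(base, "comm")) as f:
                name = f.read().strip()
            with open(os.path.join(base, "stack")) as f:
                stack = f.read()
        except OSError:
            # The thread exited in the meantime, or we lack permission.
            continue
        stacks[int(tid)] = (name, stack)
    return stacks


def format_stacks(stacks):
    """Render the collected stacks as plain text, one section per thread."""
    sections = []
    for tid, (name, stack) in stacks.items():
        sections.append(f"--- tid {tid} ({name}) ---\n{stack}")
    return "\n".join(sections)
```

Usage would be something like print(format_stacks(collect_thread_stacks(12345))), run as root, where 12345 is a placeholder for the PID of the qemu process.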
Annoyingly, since I first reported this and then posted it to the kernel.org lists in this thread, I have been
unable to reproduce the problem. I have now done ~22 SCSI hotplug / remove cycles and none of them triggered
the issue.
Kernel versions are still the same on both the Proxmox host and the Fedora guest. However, I see that the qemu-kvm
packages on the Proxmox host have been updated. The host itself has not been rebooted in that time.
I wonder if the initial revert included in 6.6.8 fixed the main problem, and the later update to the qemu-kvm packages
on the Proxmox host, followed by the last reboot of the VM with the new KVM package, fixed the second issue.
Seeing as I can no longer reproduce this, whereas it was 100% reproducible before, maybe I'm now chasing ghosts.
I'll continue to monitor, as I normally do this SCSI hotplug ~3 times per week when backing up to different
external HDDs. If I do observe it again, I'll grab the stack and reply to this thread with what I find.
Until then, I don't want to waste other people's time also chasing ghosts :)
--
Steven Haigh
📧 netwiz@....id.au
💻 https://crc.id.au