Message-ID: <5e17a35d-2a94-f482-c466-521afcab80b8@linux.ibm.com>
Date: Thu, 5 Jan 2023 19:16:37 -0500
From: Matthew Rosato <mjrosato@...ux.ibm.com>
To: Jason Gunthorpe <jgg@...dia.com>,
Alex Williamson <alex.williamson@...hat.com>
Cc: cohuck@...hat.com, borntraeger@...ux.ibm.com,
jjherne@...ux.ibm.com, akrowiak@...ux.ibm.com, pasic@...ux.ibm.com,
zhenyuw@...ux.intel.com, zhi.a.wang@...el.com, hch@...radead.org,
intel-gfx@...ts.freedesktop.org,
intel-gvt-dev@...ts.freedesktop.org, linux-s390@...r.kernel.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Kevin Tian <kevin.tian@...el.com>,
Christoph Hellwig <hch@....de>
Subject: Re: [PATCH v3 1/1] vfio: remove VFIO_GROUP_NOTIFY_SET_KVM
On 1/5/23 6:34 PM, Jason Gunthorpe wrote:
> On Thu, Jan 05, 2023 at 03:09:30PM -0700, Alex Williamson wrote:
>> On Thu, 19 May 2022 14:33:11 -0400
>> Matthew Rosato <mjrosato@...ux.ibm.com> wrote:
>>
>>> Rather than relying on a notifier for associating the KVM with
>>> the group, let's assume that the association has already been
>>> made prior to device_open. The first time a device is opened
>>> associate the group KVM with the device.
>>>
>>> This fixes a user-triggerable oops in GVT.
>>
>> It seems this has traded an oops for a deadlock, which still exists
>> today in both GVT-g and vfio-ap. These are the only vfio drivers that
>> care about kvm, so they make use of kvm_{get,put}_kvm(), where the
vfio-pci-zdev also cares about kvm.
>> latter is called by their .close_device() callbacks.
Huh, I've never seen this deadlock with vfio-pci-zdev or vfio-ap, but I see what you're saying. I guess it's not seen under typical circumstances with QEMU because kvm_vfio_group_del would already have been triggered via KVM_DEV_VFIO_GROUP_DEL by the time we close the device, so the group would not be found during the kvm_vfio_destroy call? (I'm not at all suggesting that we should rely on userspace behaving in this order; I'm just wondering why I never saw it while testing.)
>
> Bleck
>
> It is pretty common to run the final part of 'put' from a workqueue
> specifically to avoid stuff like this, eg fput does it
>
> Maybe that is the simplest?
Yeah, this is also what I was thinking: replace the direct kvm_put_kvm calls with, say, schedule_delayed_work in each driver, where the delayed task just does the kvm_put_kvm (along with a brief comment explaining why we handle the put asynchronously).
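
A rough userspace model of the deferred-put idea, for illustration only. This is not kernel code: fake_kvm, put_work, group_lock_held and the helper functions are all made-up names, the lock is modelled as a plain flag (this message does not name the actual lock), and the "workqueue" is a single slot that the caller drains by hand. It only tries to show why moving the final put out of the close path sidesteps the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm: a refcount and a "torn down" flag. */
struct fake_kvm {
    int refs;
    bool destroyed;
};

/* Placeholder for whatever lock is held across close_device. */
static bool group_lock_held;

/*
 * Model of the final put: teardown needs the same lock the close path
 * holds. Returns false where the real code would deadlock, and leaves
 * the reference in place in that case.
 */
static bool fake_kvm_put(struct fake_kvm *kvm)
{
    assert(kvm->refs > 0);
    if (--kvm->refs > 0)
        return true;
    if (group_lock_held) {
        kvm->refs++;            /* undo: teardown could not proceed */
        return false;
    }
    kvm->destroyed = true;
    return true;
}

/* Single-slot stand-in for a work item queued with schedule_delayed_work. */
struct put_work {
    struct fake_kvm *kvm;
    bool pending;
};

/* Direct put from the close path, with the lock still held. */
static bool close_device_direct(struct fake_kvm *kvm)
{
    bool ok;

    group_lock_held = true;
    ok = fake_kvm_put(kvm);
    group_lock_held = false;
    return ok;
}

/*
 * Deferred variant: the close path only queues the put. The put is
 * handled asynchronously so the final teardown never runs while the
 * close path's lock is held.
 */
static void close_device_deferred(struct put_work *w, struct fake_kvm *kvm)
{
    group_lock_held = true;
    w->kvm = kvm;
    w->pending = true;
    group_lock_held = false;
}

/* What the worker would do later, outside the close path. */
static bool run_pending(struct put_work *w)
{
    struct fake_kvm *kvm = w->kvm;

    if (!w->pending)
        return true;
    w->pending = false;
    w->kvm = NULL;
    return fake_kvm_put(kvm);
}
```

In this model, close_device_direct() on the last reference reports the would-be deadlock, while close_device_deferred() followed by run_pending() completes the teardown. A real driver would additionally have to make sure the pending work is flushed before the module or device goes away, which the sketch does not cover.
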
Other than that... The goal of this patch originally was to get the kvm reference at first open_device and release it with the very last close_device, so the only other option I could think of would be to take the responsibility back from the vfio drivers and do the kvm_get_kvm and kvm_put_kvm directly in vfio_main after dropping the (but that would result in some ugly symbol linkage and would acquire kvm references that a driver maybe does not care about, so I don't really like that idea).