Message-Id: <20190510001212.3e2bf5ea.pasic@linux.ibm.com>
Date: Fri, 10 May 2019 00:12:12 +0200
From: Halil Pasic <pasic@...ux.ibm.com>
To: Pierre Morel <pmorel@...ux.ibm.com>
Cc: Cornelia Huck <cohuck@...hat.com>,
Parav Pandit <parav@...lanox.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"cjia@...dia.com" <cjia@...dia.com>,
Tony Krowiak <akrowiak@...ux.ibm.com>
Subject: Re: [PATCHv2 08/10] vfio/mdev: Improve the create/remove sequence

On Thu, 9 May 2019 18:26:59 +0200
Pierre Morel <pmorel@...ux.ibm.com> wrote:
> On 09/05/2019 11:06, Cornelia Huck wrote:
> > [vfio-ap folks: find a question regarding removal further down]
> >
> > On Wed, 8 May 2019 22:06:48 +0000
> > Parav Pandit <parav@...lanox.com> wrote:
> >
> >>> -----Original Message-----
> >>> From: Cornelia Huck <cohuck@...hat.com>
> >>> Sent: Wednesday, May 8, 2019 12:10 PM
> >>> To: Parav Pandit <parav@...lanox.com>
> >>> Cc: kvm@...r.kernel.org; linux-kernel@...r.kernel.org;
> >>> kwankhede@...dia.com; alex.williamson@...hat.com; cjia@...dia.com
> >>> Subject: Re: [PATCHv2 08/10] vfio/mdev: Improve the create/remove
> >>> sequence
> >>>
> >>> On Tue, 30 Apr 2019 17:49:35 -0500
> >>> Parav Pandit <parav@...lanox.com> wrote:
> >>>
>
> ...snip...
>
> >>>> @@ -373,16 +330,15 @@ int mdev_device_remove(struct device *dev, bool force_remove)
> >>>> mutex_unlock(&mdev_list_lock);
> >>>>
> >>>> type = to_mdev_type(mdev->type_kobj);
> >>>> + mdev_remove_sysfs_files(dev, type);
> >>>> + device_del(&mdev->dev);
> >>>> parent = mdev->parent;
> >>>> + ret = parent->ops->remove(mdev);
> >>>> + if (ret)
> >>>> + dev_err(&mdev->dev, "Remove failed: err=%d\n", ret);
> >>>
> >>> I think carrying on with removal regardless of the return code of the
> >>> ->remove callback makes sense, as it simply matches usual practice.
> >>> However, are we sure that every vendor driver works well with that? I think
> >>> it should, as removal from bus unregistration (vs. from the sysfs
> >>> file) was always something it could not veto, but have you looked at the
> >>> individual drivers?
> >>>
> >> I looked at the following drivers a little while back, and looked
> >> again now.
> >>
> >> drivers/gpu/drm/i915/gvt/kvmgt.c invalidates the handle in intel_vgpu_release(), which should finish before remove() is invoked.
> >>
> >> s390 vfio-ccw (drivers/s390/cio/vfio_ccw_ops.c): vfio_ccw_mdev_remove() always returns 0.
> >> s390 crypto fails the remove() until vfio_ap_mdev_release() marks kvm NULL, which should finish before remove() is invoked.
> >
> > That one is giving me a bit of a headache (the ->kvm reference is
> > supposed to keep us from detaching while a vm is running), so let's cc:
> > the vfio-ap maintainers to see whether they have any concerns.
> >
>
> We are aware of this race and corrected it in the IRQ patches, where it
> would have become a real issue.
> We now increment/decrement the KVM reference counter inside open and
> release.
> It should be correct after this.
>
Tony, what is your take on this? I don't have the bandwidth to think
this through properly, but my intuition tells me: this might be more
complicated than what Pierre's response suggests.

Regards,
Halil
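
For readers following the thread, the two behaviours under discussion can be
sketched as a small user-space model in C. This is only an illustration under
stated assumptions, not the kernel code: every name prefixed with model_ is
hypothetical, the vendor callback returning -EBUSY while a KVM reference is
held is an assumption based on how the thread describes vfio-ap, and the
"registered" flag merely stands in for the sysfs removal and device_del()
that the patch performs before calling ->remove.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the kernel's real types. */
struct model_kvm {
	int refcount;		/* references held on the guest */
};

struct model_mdev {
	struct model_kvm *kvm;	/* non-NULL while the device is open */
	int registered;		/* 1 until the core tears the device down */
};

/* open: take a reference on the guest, as Pierre describes. */
void model_open(struct model_mdev *m, struct model_kvm *kvm)
{
	kvm->refcount++;
	m->kvm = kvm;
}

/* release: drop the reference and mark kvm NULL. */
void model_release(struct model_mdev *m)
{
	if (!m->kvm)
		return;
	assert(m->kvm->refcount > 0);
	m->kvm->refcount--;
	m->kvm = NULL;
}

/* Vendor ->remove (assumed behaviour): refuse while still open. */
int model_vendor_remove(struct model_mdev *m)
{
	return m->kvm ? -EBUSY : 0;
}

/*
 * Core removal path as in the quoted hunk: the device is torn down first,
 * then the vendor callback runs; a failure is logged but cannot veto.
 * Returning ret is a choice of this model so callers can inspect it.
 */
int model_core_remove(struct model_mdev *m)
{
	int ret;

	m->registered = 0;	/* stands in for sysfs removal + device_del() */
	ret = model_vendor_remove(m);
	if (ret)
		fprintf(stderr, "Remove failed: err=%d\n", ret);
	return ret;
}
```

In this model, calling model_core_remove() while the device is still open
leaves the device unregistered even though the vendor callback reported
-EBUSY, which is the situation the ->kvm check was meant to prevent. If
model_release() runs first, the callback succeeds and the guest's reference
count returns to zero.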