Message-ID: <YxdMwoC+58NvydY3@nvidia.com>
Date: Tue, 6 Sep 2022 10:36:02 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Matthew Rosato <mjrosato@...ux.ibm.com>, iommu@...ts.linux.dev,
Alex Williamson <alex.williamson@...hat.com>,
linux-s390@...r.kernel.org, schnelle@...ux.ibm.com,
pmorel@...ux.ibm.com, borntraeger@...ux.ibm.com, hca@...ux.ibm.com,
gor@...ux.ibm.com, gerald.schaefer@...ux.ibm.com,
agordeev@...ux.ibm.com, svens@...ux.ibm.com, joro@...tes.org,
will@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/2] iommu/s390: Fix race with release_device ops
On Mon, Sep 05, 2022 at 10:46:44AM +0100, Robin Murphy wrote:
> > I've been trying to understand Robin's latest remarks because maybe I
> > don't really understand your situation right.
>
> That was really just me thinking out loud to guess at how it must be
> happening - I wasn't sure whether VFIO is actually intended to allow that or
> not, so if not then by all means let's look at fixing that, but as I say I
> think we're only seeing it provoke a problem at the driver level because of
> 9ac8545199a1, and fixing VFIO doesn't fix that in general. And conversely if
> we *can* fix that properly at the IOMMU API level then the current VFIO
> behaviour should become benign again anyway.

Okay, so there are probably other problems here that highlighted
this.

> > IMHO this is definitely a VFIO bug, because in a single-device group
> > we must not allow the domain to remain attached past remove(). Or more
> > broadly we shouldn't be holding ownership of a group without also
> > having a driver attached.
>
> FWIW I was assuming it might be fine for a VFIO user to hold the group open
> if they expect the device to come back again and re-bind (for example,
> perhaps over some reconfiguration that requires turning SR-IOV off and on
> again?)

Once all the devices in the group are removed, something like
pci_device_group() will have no way to discover the group again. E.g.
in the SR-IOV case it will just fall right down to iommu_group_alloc(),
and that gives a new struct iommu_group and a new IDR allocation.

So in the general case this doesn't happen, and I don't think any VFIO
userspace should attempt to rely on it.
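
To illustrate the lookup behaviour described above, here is a toy
userspace model in C. It is only a sketch, not kernel code: every toy_*
name is hypothetical, the sibling pointer stands in for whatever a
helper like pci_device_group() would walk to find an existing group,
and the lowest-free-id allocator is an assumption of the sketch rather
than a statement about the real allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_GROUPS 16

/* Toy stand-in for a group; not the kernel's struct iommu_group. */
struct toy_group {
	int id;    /* stands in for the group's allocated id */
	int refs;  /* devices plus any external holder (e.g. a user fd) */
};

struct toy_device {
	struct toy_group *group;     /* NULL while the device is absent */
	struct toy_device *sibling;  /* device that would share a group */
};

static bool toy_id_used[TOY_MAX_GROUPS];

/* Allocate a fresh group with the lowest free id (sketch assumption). */
static struct toy_group *toy_group_alloc(void)
{
	for (int id = 0; id < TOY_MAX_GROUPS; id++) {
		if (!toy_id_used[id]) {
			struct toy_group *g = malloc(sizeof(*g));

			if (!g)
				return NULL;
			toy_id_used[id] = true;
			g->id = id;
			g->refs = 0;
			return g;
		}
	}
	return NULL;
}

static struct toy_group *toy_group_get(struct toy_group *g)
{
	g->refs++;
	return g;
}

/* Drop a reference; the id is only released with the last one. */
static void toy_group_put(struct toy_group *g)
{
	assert(g->refs > 0);
	if (--g->refs == 0) {
		toy_id_used[g->id] = false;
		free(g);
	}
}

/*
 * A group can only be found through a device that is still in it.
 * With no such device, the lookup falls through to a fresh allocation.
 */
static struct toy_group *toy_device_group(struct toy_device *dev)
{
	if (dev->sibling && dev->sibling->group)
		return dev->sibling->group;
	return toy_group_alloc();
}

static int toy_probe(struct toy_device *dev)
{
	struct toy_group *g = toy_device_group(dev);

	if (!g)
		return -1;
	dev->group = toy_group_get(g);
	return 0;
}

static void toy_release(struct toy_device *dev)
{
	struct toy_group *g = dev->group;

	if (!g)
		return;
	dev->group = NULL;
	toy_group_put(g);
}
```

In this model, if a caller takes an extra toy_group_get() on a device's
group, then calls toy_release() followed by toy_probe() on the same
device, the device ends up in a different toy_group with a different
id, while the held reference still points at the old, now empty group.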

From an API perspective, it is a much saner API toward iommu-using
drivers like VFIO if those drivers only use the iommu API while they
have a device driver attached.

Regards,
Jason