Message-ID: <YxI7kzuchcJz8sRX@nvidia.com>
Date: Fri, 2 Sep 2022 14:21:23 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Matthew Rosato <mjrosato@...ux.ibm.com>
Cc: Robin Murphy <robin.murphy@....com>, iommu@...ts.linux.dev,
Alex Williamson <alex.williamson@...hat.com>,
linux-s390@...r.kernel.org, schnelle@...ux.ibm.com,
pmorel@...ux.ibm.com, borntraeger@...ux.ibm.com, hca@...ux.ibm.com,
gor@...ux.ibm.com, gerald.schaefer@...ux.ibm.com,
agordeev@...ux.ibm.com, svens@...ux.ibm.com, joro@...tes.org,
will@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/2] iommu/s390: Fix race with release_device ops
On Fri, Sep 02, 2022 at 01:11:09PM -0400, Matthew Rosato wrote:
> On 9/1/22 4:37 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 01, 2022 at 12:14:24PM -0400, Matthew Rosato wrote:
> >> On 9/1/22 6:25 AM, Robin Murphy wrote:
> >>> On 2022-08-31 21:12, Matthew Rosato wrote:
> >>>> With commit fa7e9ecc5e1c ("iommu/s390: Tolerate repeat attach_dev
> >>>> calls") s390-iommu is supposed to handle dynamic switching between IOMMU
> >>>> domains and the DMA API handling. However, this commit does not
> >>>> sufficiently handle the case where the device is released via a call
> >>>> to the release_device op as it may occur at the same time as an opposing
> >>>> attach_dev or detach_dev since the group mutex is not held over
> >>>> release_device. This was observed if the device is deconfigured during a
> >>>> small window during vfio-pci initialization and can result in WARNs and
> >>>> potential kernel panics.
> >>>
> >>> Hmm, the more I think about it, something doesn't sit right about this whole situation... release_device is called via the notifier from device_del() after the device has been removed from its parent bus and largely dismantled; it should definitely not still have a driver bound by that point, so how is VFIO doing things that manage to race at all?
> >>>
> >>> Robin.
> >>
> >> So, I generally have seen the issue manifest as one of the calls
> >> into the iommu core from __vfio_group_unset_container
> >> (e.g. iommu_detach_group via vfio_type1_iommu) failing with a WARN.
> >> This happens when the vfio group fd is released, which could be
> >> coming e.g. from a userspace ioctl VFIO_GROUP_UNSET_CONTAINER.
> >> AFAICT there's nothing serializing the notion of calling into the
> >> iommu core here against a device that is simultaneously going
> >> through release_device (because we don't enter release_device with
> >> the group mutex held), resulting in unpredictable behavior between
> >> the dueling attach_dev/detach_dev and the release_device for
> >> s390-iommu at least.
> >
> > Oh, this is a vfio bug.
>
> I've been running with your diff applied today on s390 and this
> indeed fixes the issue by preventing the detach-after-release coming
> out of vfio.

Heh, I'm shocked it worked at all.

I've been trying to understand Robin's latest remarks, because maybe I
don't really understand your situation correctly.

IMHO this is definitely a VFIO bug, because in a single-device group
we must not allow the domain to remain attached past remove(). Or, more
broadly, we shouldn't be holding ownership of a group without also
having a driver attached.

But this discussion with Robin about multi-device groups and hotplug
makes me wonder what your situation is. There is certainly something
interesting there too, and this can't be a solution to that problem.
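
To make the single-device invariant above concrete (no domain stays
attached past remove(), and nothing may attach or detach after release),
here is a minimal user-space model. It is only a sketch under stated
assumptions: every name in it (model_group, model_attach, model_detach,
model_release) is hypothetical, none of it is kernel code, and it is
neither the diff discussed in this thread nor the "orphan the group fd"
idea. A pthread mutex stands in for the group mutex.

```c
/* User-space model only: none of these names are kernel APIs. */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct model_group {
    pthread_mutex_t lock;   /* stands in for the group mutex */
    bool attached;          /* a domain is currently attached */
    bool released;          /* the device has gone through release */
};

static void model_group_init(struct model_group *g)
{
    pthread_mutex_init(&g->lock, NULL);
    g->attached = false;
    g->released = false;
}

/* Attach a domain; refused once the device has been released. */
static int model_attach(struct model_group *g)
{
    int ret = 0;

    pthread_mutex_lock(&g->lock);
    if (g->released)
        ret = -ENODEV;
    else
        g->attached = true;
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* Detach a domain; refused once the device has been released. */
static int model_detach(struct model_group *g)
{
    int ret = 0;

    pthread_mutex_lock(&g->lock);
    if (g->released)
        ret = -ENODEV;
    else
        g->attached = false;
    pthread_mutex_unlock(&g->lock);
    return ret;
}

/* Release: drop any attached domain under the same lock, then mark the
 * device gone so later attach/detach calls cannot touch it. */
static void model_release(struct model_group *g)
{
    pthread_mutex_lock(&g->lock);
    g->attached = false;
    g->released = true;
    pthread_mutex_unlock(&g->lock);
}
```

In this model a detach that arrives after release gets -ENODEV instead
of operating on a dismantled device, because release and attach/detach
serialize on the same lock. That is the ordering the real code does not
get today, since release_device is not entered with the group mutex held.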

> Can you send as a patch for review?

After I wrote this I had a better idea: avoid the completion and
just fully orphan the group fd.

And the patch is kind of messy.

Can you forward me the backtrace you hit also?

(Though I'm not sure I can get to this promptly; I have only 4 working
days before LPC and still many things to do.)

Thanks,
Jason