lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YxI7kzuchcJz8sRX@nvidia.com>
Date:   Fri, 2 Sep 2022 14:21:23 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Matthew Rosato <mjrosato@...ux.ibm.com>
Cc:     Robin Murphy <robin.murphy@....com>, iommu@...ts.linux.dev,
        Alex Williamson <alex.williamson@...hat.com>,
        linux-s390@...r.kernel.org, schnelle@...ux.ibm.com,
        pmorel@...ux.ibm.com, borntraeger@...ux.ibm.com, hca@...ux.ibm.com,
        gor@...ux.ibm.com, gerald.schaefer@...ux.ibm.com,
        agordeev@...ux.ibm.com, svens@...ux.ibm.com, joro@...tes.org,
        will@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 1/2] iommu/s390: Fix race with release_device ops

On Fri, Sep 02, 2022 at 01:11:09PM -0400, Matthew Rosato wrote:
> On 9/1/22 4:37 PM, Jason Gunthorpe wrote:
> > On Thu, Sep 01, 2022 at 12:14:24PM -0400, Matthew Rosato wrote:
> >> On 9/1/22 6:25 AM, Robin Murphy wrote:
> >>> On 2022-08-31 21:12, Matthew Rosato wrote:
> >>>> With commit fa7e9ecc5e1c ("iommu/s390: Tolerate repeat attach_dev
> >>>> calls") s390-iommu is supposed to handle dynamic switching between IOMMU
> >>>> domains and the DMA API handling.  However, this commit does not
> >>>> sufficiently handle the case where the device is released via a call
> >>>> to the release_device op as it may occur at the same time as an opposing
> >>>> attach_dev or detach_dev since the group mutex is not held over
> >>>> release_device.  This was observed if the device is deconfigured during a
> >>>> small window during vfio-pci initialization and can result in WARNs and
> >>>> potential kernel panics.
> >>>
> >>> Hmm, the more I think about it, something doesn't sit right about this whole situation... release_device is called via the notifier from device_del() after the device has been removed from its parent bus and largely dismantled; it should definitely not still have a driver bound by that point, so how is VFIO doing things that manage to race at all?
> >>>
> >>> Robin.
> >>
> >> So, I generally have seen the issue manifest as one of the calls
> >> into the iommu core from __vfio_group_unset_container
> >> (e.g. iommu_deatch_group via vfio_type1_iommu) failing with a WARN.
> >> This happens when the vfio group fd is released, which could be
> >> coming e.g. from a userspace ioctl VFIO_GROUP_UNSET_CONTAINER.
> >> AFAICT there's nothing serializing the notion of calling into the
> >> iommu core here against a device that is simultaneously going
> >> through release_device (because we don't enter release_device with
> >> the group mutex held), resulting in unpredictable behavior between
> >> the dueling attach_dev/detach_dev and the release_device for
> >> s390-iommu at least.
> > 
> > Oh, this is a vfio bug.
> 
> I've been running with your diff applied today on s390 and this
> indeed fixes the issue by preventing the detach-after-release coming
> out of vfio. 

Heh, I'm shocked it worked at all

I've been trying to understand Robin's latest remarks because maybe I
don't really understand your situation right.

IMHO this is definately a VFIO bug, because in a single-device group
we must not allow the domain to remain attached past remove(). Or more
broadly we shouldn't be holding ownership of a group without also
having a driver attached.

But this dicussion with Robin about multi-device groups and hotplug
makes me wonder what your situation is? There is certainly something
interesting there too, and this can't be a solution to that problem.

> Can you send as a patch for review?

After I wrote this I had a better idea, to avoid the completion and
just fully orphan the group fd.

And the patch is kind of messy

Can you forward me the backtrace you hit also?

(Though I'm not sure I can get to this promptly, I have only 4 working
days before LPC and still many things to do)

Thanks,
Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ