linux-kernel - Re: [PATCH v3 3/3] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200523170602.5eb09a66@x1.home>
Date:   Sat, 23 May 2020 17:06:02 -0600
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        cohuck@...hat.com, jgg@...pe.ca, cai@....pw
Subject: Re: [PATCH v3 3/3] vfio-pci: Invalidate mmaps and block MMIO access
 on disabled memory

On Sat, 23 May 2020 15:34:17 -0400
Peter Xu <peterx@...hat.com> wrote:

> Hi, Alex,
> 
> On Fri, May 22, 2020 at 01:17:43PM -0600, Alex Williamson wrote:
> > @@ -1346,15 +1526,32 @@ static vm_fault_t vfio_pci_mmap_fault(struct vm_fault *vmf)
> >  {
> >  	struct vm_area_struct *vma = vmf->vma;
> >  	struct vfio_pci_device *vdev = vma->vm_private_data;
> > +	vm_fault_t ret = VM_FAULT_NOPAGE;
> > +
> > +	mutex_lock(&vdev->vma_lock);
> > +	down_read(&vdev->memory_lock);  
> 
> I remembered to have seen the fault() handling FAULT_FLAG_RETRY_NOWAIT at least
> in the very first version, but it's not here any more...  Could I ask what's
> the reason behind?  I probably have missed something along with the versions,
> I'm just not sure whether e.g. this would potentially block a GUP caller even
> if it's with FOLL_NOWAIT.

This is largely what v2 was about, from the cover letter:

    Locking in 3/ is substantially changed to avoid the retry scenario
    within the fault handler, therefore a caller who does not allow
    retry will no longer receive a SIGBUS on contention.

The discussion thread starts here:

https://lore.kernel.org/kvm/20200501234849.GQ26002@ziepe.ca/

Feel free to interject if there's something that doesn't make sense,
the idea is that since we've fixed the lock ordering we never need to
release one lock to wait for another, therefore we can wait for the
lock.  I'm under the impression that we can wait for the lock
regardless of the flags under these conditions.

> Side note: Another thing I thought about when reading this patch - there seems
> to have quite some possibility that the VFIO_DEVICE_PCI_HOT_RESET ioctl will
> start to return -EBUSY now.  Not a problem for this series, but maybe we should
> rememeber to let the userspace handle -EBUSY properly as follow up too, since I
> saw QEMU seemed to not handle -EBUSY for host reset path right now.

I think this has always been the case, whether it's the device lock or
this lock, the only way I know to avoid potential deadlock is to use
the 'try' locking semantics.  In normal scenarios I expect access to
sibling devices is quiesced at this point, so a user driver actually
wanting to achieve a reset shouldn't be affected by this.  Thanks,

Alex