Message-ID: <aQDhb7HahKWLuUG4@devgpu012.nha5.facebook.com>
Date: Tue, 28 Oct 2025 08:29:51 -0700
From: Alex Mastro <amastro@...com>
To: Alex Williamson <alex@...zbot.org>
CC: Alejandro Jimenez <alejandro.j.jimenez@...cle.com>,
	Jason Gunthorpe <jgg@...pe.ca>, <kvm@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v4 0/3] vfio: handle DMA map/unmap up to the addressable
	limit

On Mon, Oct 27, 2025 at 07:57:32PM -0600, Alex Williamson wrote:
> On Mon, 27 Oct 2025 09:02:55 -0700
> Alex Mastro <amastro@...com> wrote:
>
> > On Tue, Oct 21, 2025 at 09:25:55AM -0700, Alex Mastro wrote:
> > > On Mon, Oct 20, 2025 at 03:36:33PM -0600, Alex Williamson wrote:
> > > > Along with the tag, it would probably be useful in that same commit to
> > > > expand on the scope of the issue in the commit log. I believe we allow
> > > > mappings to be created at the top of the address space that cannot be
> > > > removed via ioctl, but such inconsistency should result in an
> > > > application error due to the failed ioctl and does not affect cleanup
> > > > on release.
> >
> > I want to make sure I understand the cleanup-on-release path. Is my supposition
> > below correct?
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > index 916cad80941c..7f8d23b06680 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -1127,6 +1127,7 @@ static size_t unmap_unpin_slow(struct vfio_domain *domain,
> > static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > bool do_accounting)
> > {
> > + // end == 0 due to overflow
> > dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
> > struct vfio_domain *domain, *d;
> > LIST_HEAD(unmapped_region_list);
> > @@ -1156,6 +1157,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > }
> >
> > iommu_iotlb_gather_init(&iotlb_gather);
> > + // doesn't enter the loop, never calls iommu_unmap
>
> If it were only that, I think the iommu_domain_free() would be
> sufficient, but it looks like we're also missing the unpin. Freeing

Oh, right.

> the IOMMU domain isn't going to resolve that. So it actually appears
> we're leaking those pinned pages and this isn't as self-resolving as I
> had thought. I imagine if you ran your new unit test to the point where
> we'd pinned and failed to unpin the majority of memory you'd start to
> see system-wide problems. Thanks,

Makes sense.

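To spell the arithmetic out for myself, here's a minimal userspace sketch
(illustrative only -- the dma_addr_t typedef and the values are made up,
this is not the kernel code) of why the exclusive end wraps for a mapping
whose last byte is the top of the address space, so the unmap/unpin loop
above never runs:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t dma_addr_t;	/* stand-in for the kernel type */

int main(void)
{
	/* hypothetical mapping whose last byte is ~0 */
	dma_addr_t iova = UINT64_MAX - 0xffff;	/* 0xffffffffffff0000 */
	uint64_t size = 0x10000;

	/* mirrors "end = dma->iova + dma->size": wraps to 0 */
	dma_addr_t end = iova + size;
	printf("end  = 0x%" PRIx64 "\n", end);			/* 0x0 */
	printf("loop entered: %s\n", iova < end ? "yes" : "no");	/* no */

	/*
	 * One overflow-safe alternative: track the inclusive last address
	 * instead, which cannot wrap for size >= 1.
	 */
	dma_addr_t last = iova + size - 1;
	printf("last = 0x%" PRIx64 "\n", last);		/* 0xffffffffffffffff */
	return 0;
}
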
> Alex
>
> > while (iova < end) {
> > size_t unmapped, len;
> > phys_addr_t phys, next;
> > @@ -1210,6 +1212,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
> > {
> > WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
> > +	// ...which brings us here, into vfio_unmap_unpin() above
> > vfio_unmap_unpin(iommu, dma, true);
> > vfio_unlink_dma(iommu, dma);
> > put_task_struct(dma->task);
> > @@ -2394,6 +2397,8 @@ static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
> > struct rb_node *node;
> >
> > while ((node = rb_first(&iommu->dma_list)))
> > + // eventually, we attempt to remove the end of address space
> > + // mapping
> > vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
> > }
> >
> > @@ -2628,6 +2633,8 @@ static void vfio_release_domain(struct vfio_domain *domain)
> > kfree(group);
> > }
> >
> > + // Is this backstop what saves us? Even though we didn't do individual
> > + // unmaps, the "leaked" end of address space mappings get freed here?
> > iommu_domain_free(domain->domain);
> > }
> >
> > @@ -2643,10 +2650,12 @@ static void vfio_iommu_type1_release(void *iommu_data)
> > kfree(group);
> > }
> >
> > +	// cleanup on release starts here
> > vfio_iommu_unmap_unpin_all(iommu);
> >
> > list_for_each_entry_safe(domain, domain_tmp,
> > &iommu->domain_list, next) {
> > +		// ...and eventually each domain is released here,
> > +		// reaching iommu_domain_free() above
> > vfio_release_domain(domain);
> > list_del(&domain->next);
> > kfree(domain);
>
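
For completeness, this is roughly the userspace-visible shape of the problem
you describe above (a hypothetical, untested sketch: it assumes "container"
is an already-configured type1 container fd and that the IOMMU aperture
reaches the top of the address space):

#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/vfio.h>

static void map_then_try_unmap_at_top(int container)
{
	size_t len = 0x10000;
	/* error handling elided for brevity */
	void *vaddr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (uintptr_t)vaddr,
		.iova = UINT64_MAX - len + 1,	/* last byte is ~0 */
		.size = len,
	};

	/* the map may be accepted... */
	if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
		perror("VFIO_IOMMU_MAP_DMA");

	struct vfio_iommu_type1_dma_unmap unmap = {
		.argsz = sizeof(unmap),
		.iova = map.iova,
		.size = map.size,
	};

	/*
	 * ...while, per the discussion above, the matching unmap can fail,
	 * leaving cleanup to container release -- the path traced in the
	 * diff, where the loop is skipped and the pins leak.
	 */
	if (ioctl(container, VFIO_IOMMU_UNMAP_DMA, &unmap))
		perror("VFIO_IOMMU_UNMAP_DMA");
}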