Message-ID: <20251027195732.2b7d1d3f@shazbot.org>
Date: Mon, 27 Oct 2025 19:57:32 -0600
From: Alex Williamson <alex@...zbot.org>
To: Alex Mastro <amastro@...com>
Cc: Alejandro Jimenez <alejandro.j.jimenez@...cle.com>, Jason Gunthorpe
<jgg@...pe.ca>, <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v4 0/3] vfio: handle DMA map/unmap up to the addressable
limit

On Mon, 27 Oct 2025 09:02:55 -0700
Alex Mastro <amastro@...com> wrote:
> On Tue, Oct 21, 2025 at 09:25:55AM -0700, Alex Mastro wrote:
> > On Mon, Oct 20, 2025 at 03:36:33PM -0600, Alex Williamson wrote:
> > > Along with the tag, it would probably be useful in that same commit to
> > > expand on the scope of the issue in the commit log. I believe we allow
> > > mappings to be created at the top of the address space that cannot be
> > > removed via ioctl, but such inconsistency should result in an
> > > application error due to the failed ioctl and does not affect cleanup
> > > on release.
>
> I want to make sure I understand the cleanup-on-release path. Is my supposition
> below correct?
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 916cad80941c..7f8d23b06680 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1127,6 +1127,7 @@ static size_t unmap_unpin_slow(struct vfio_domain *domain,
> static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> bool do_accounting)
> {
> + // end == 0 due to overflow
> dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
> struct vfio_domain *domain, *d;
> LIST_HEAD(unmapped_region_list);
> @@ -1156,6 +1157,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> }
>
> iommu_iotlb_gather_init(&iotlb_gather);
> + // doesn't enter the loop, never calls iommu_unmap

If it were only that, I think the iommu_domain_free() would be
sufficient, but it looks like we're also missing the unpin. Freeing
the IOMMU domain isn't going to resolve that. So it actually appears
we're leaking those pinned pages, and this isn't as self-resolving as I
had thought. I imagine that if you ran your new unit test to the point
where we'd pinned and failed to unpin the majority of memory, you'd start
to see system-wide problems. Thanks,
Alex
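
To make the failure mode concrete, here is a minimal user-space sketch of
the arithmetic (illustration only, not kernel code; the mapping size and
page size are arbitrary, and the inclusive-end variant below is just one
possible overflow-safe formulation, not the actual fix):

/*
 * Demonstrates why "end = iova + size" wraps to 0 for a mapping that ends
 * at the top of a 64-bit IOVA space, so an exclusive-end loop like the one
 * in vfio_unmap_unpin() is never entered, while an inclusive-end bound
 * still walks every page.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
        uint64_t size = 1ULL << 20;             /* 1 MiB mapping */
        uint64_t iova = 0ULL - size;            /* last 1 MiB of the IOVA space */
        uint64_t end  = iova + size;            /* wraps to 0 */
        int pages = 0;

        printf("iova=0x%" PRIx64 " size=0x%" PRIx64 " end=0x%" PRIx64 "\n",
               iova, size, end);

        /* Exclusive end: never entered because end == 0. */
        for (uint64_t cur = iova; cur < end; cur += 4096)
                pages++;
        printf("exclusive-end loop walked %d pages\n", pages);

        /* Inclusive end: last == 0xffffffffffffffff, no wrap. */
        uint64_t last = iova + size - 1;

        pages = 0;
        for (uint64_t cur = iova; ; cur += 4096) {
                pages++;
                /* Compare the page's inclusive end; cur + 4096 would wrap. */
                if (cur + 4095 >= last)
                        break;
        }
        printf("inclusive-end loop walked %d pages\n", pages);
        return 0;
}
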
> while (iova < end) {
> size_t unmapped, len;
> phys_addr_t phys, next;
> @@ -1210,6 +1212,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
> {
> WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
> + // go here
> vfio_unmap_unpin(iommu, dma, true);
> vfio_unlink_dma(iommu, dma);
> put_task_struct(dma->task);
> @@ -2394,6 +2397,8 @@ static void vfio_iommu_unmap_unpin_all(struct vfio_iommu *iommu)
> struct rb_node *node;
>
> while ((node = rb_first(&iommu->dma_list)))
> + // eventually, we attempt to remove the end of address space
> + // mapping
> vfio_remove_dma(iommu, rb_entry(node, struct vfio_dma, node));
> }
>
> @@ -2628,6 +2633,8 @@ static void vfio_release_domain(struct vfio_domain *domain)
> kfree(group);
> }
>
> + // Is this backstop what saves us? Even though we didn't do individual
> + // unmaps, the "leaked" end of address space mappings get freed here?
> iommu_domain_free(domain->domain);
> }
>
> @@ -2643,10 +2650,12 @@ static void vfio_iommu_type1_release(void *iommu_data)
> kfree(group);
> }
>
> + // start here
> vfio_iommu_unmap_unpin_all(iommu);
>
> list_for_each_entry_safe(domain, domain_tmp,
> &iommu->domain_list, next) {
> + // eventually...
> vfio_release_domain(domain);
> list_del(&domain->next);
> kfree(domain);
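
For reference on the scenario raised at the top of the thread (a mapping
created at the very top of the IOVA space that the unmap ioctl then fails
to remove), a rough user-space sketch follows. It assumes "container" is
an already-configured VFIO type1 container fd (group attachment and
VFIO_SET_IOMMU omitted), whether the map itself succeeds depends on the
IOMMU's supported address width, and the sizes and error handling are
illustrative only.

#include <linux/vfio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static int map_unmap_top_of_iova_space(int container)
{
        uint64_t size = 1ULL << 20;             /* 1 MiB */
        uint64_t iova = 0ULL - size;            /* iova + size wraps to 0 */
        void *buf;

        buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
                return -errno;

        struct vfio_iommu_type1_dma_map map = {
                .argsz = sizeof(map),
                .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
                .vaddr = (uintptr_t)buf,
                .iova  = iova,
                .size  = size,
        };
        if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
                return -errno;                  /* map may be rejected outright */

        struct vfio_iommu_type1_dma_unmap unmap = {
                .argsz = sizeof(unmap),
                .iova  = iova,
                .size  = size,
        };
        if (ioctl(container, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
                /*
                 * The inconsistency discussed above: the mapping exists but
                 * cannot be removed via this ioctl, so the application sees
                 * an error.
                 */
                fprintf(stderr, "unmap failed: %s\n", strerror(errno));
                return -errno;
        }
        return 0;
}
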