[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251023163322.77c9848c@shazbot.org>
Date: Thu, 23 Oct 2025 16:33:22 -0600
From: Alex Williamson <alex@...zbot.org>
To: Alex Mastro <amastro@...com>
Cc: Alejandro Jimenez <alejandro.j.jimenez@...cle.com>, Jason Gunthorpe
<jgg@...pe.ca>, <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v4 0/3] vfio: handle DMA map/unmap up to the addressable
limit
On Thu, 23 Oct 2025 13:52:04 -0700
Alex Mastro <amastro@...com> wrote:
> One point to clarify
>
> > On Mon, Oct 20, 2025 at 03:36:33PM -0600, Alex Williamson wrote:
> > > Along with the tag, it would probably be useful in that same commit to
> > > expand on the scope of the issue in the commit log. I believe we allow
> > > mappings to be created at the top of the address space that cannot be
> > > removed via ioctl,
>
> True
>
> > > but such inconsistency should result in an application error due to the
> > > failed ioctl
>
> Actually, the ioctl does not fail in the sense that the caller gets an errno.
> Attempting to unmap a range ending at the end of address space succeeds (returns
> zero), but unmaps zero bytes. An application would only detect this if it
> explicitly checked the out size field of vfio_iommu_type1_dma_unmap. Or
> attempted to create another overlapping mapping on top of the ranges it expected
> to be unmapped.
Yes, the ioctl doesn't fail but the returned unmap size exposes the
inconsistency. This is because we don't require a 1:1 unmap, an unmap
can span multiple mappings, or none. Prior to TYPE1v2 we even allowed
splitting previous mappings. I agree that it obfuscates the
application detecting the error, but I'm not sure there's much we can
do about it given the uAPI. Let's document the scenario as best we
can. Thanks,
Alex
> Annotated below:
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 916cad80941c..039cba5a38ca 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -165,19 +165,27 @@ vfio_iommu_find_iommu_group(struct vfio_iommu *iommu,
> static struct vfio_dma *vfio_find_dma(struct vfio_iommu *iommu,
> dma_addr_t start, size_t size)
> {
> + // start == ~(dma_addr_t)0
> + // size == 0
> struct rb_node *node = iommu->dma_list.rb_node;
>
> while (node) {
> struct vfio_dma *dma = rb_entry(node, struct vfio_dma, node);
>
> + // never true because all dma->iova < ~(dma_addr_t)0
> if (start + size <= dma->iova)
> node = node->rb_left;
> + // traverses right until we get to the end of address space
> + // dma, then we walk off the end because
> + // ~(dma_addr_t)0 >= 0 == true
> + // node = NULL
> else if (start >= dma->iova + dma->size)
> node = node->rb_right;
> else
> return dma;
> }
>
> + // This happens
> return NULL;
> }
>
> @@ -201,6 +209,8 @@ static struct rb_node *vfio_find_dma_first_node(struct vfio_iommu *iommu,
> node = node->rb_right;
> }
> }
> + // iova >= start + size == true, due to overflow to zero => no first
> + // node found
> if (res && size && dma_res->iova >= start + size)
> res = NULL;
> return res;
> @@ -1397,6 +1407,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
> if (iova || size)
> goto unlock;
> size = U64_MAX;
> + // end of address space unmap passes these checks
> } else if (!size || size & (pgsize - 1) ||
> iova + size - 1 < iova || size > SIZE_MAX) {
> goto unlock;
> @@ -1442,18 +1453,23 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
> * mappings within the range.
> */
> if (iommu->v2 && !unmap_all) {
> + // passes this check as long as the unmap start doesn't split an
> + // existing dma
> dma = vfio_find_dma(iommu, iova, 1);
> if (dma && dma->iova != iova)
> goto unlock;
>
> + // dma = NULL, we pass this check
> dma = vfio_find_dma(iommu, iova + size - 1, 0);
> if (dma && dma->iova + dma->size != iova + size)
> goto unlock;
> }
>
> ret = 0;
> + // n = NULL
> n = first_n = vfio_find_dma_first_node(iommu, iova, size);
>
> + // loop body never entered
> while (n) {
> dma = rb_entry(n, struct vfio_dma, node);
> if (dma->iova >= iova + size)
> @@ -1513,6 +1529,7 @@ static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
> /* Report how much was unmapped */
> unmap->size = unmapped;
>
> + // return 0
> return ret;
> }
>
>
Powered by blists - more mailing lists