[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <68e18f2c-79ad-45ec-99b9-99ff68ba5438@oracle.com>
Date: Mon, 6 Oct 2025 21:23:56 -0400
From: Alejandro Jimenez <alejandro.j.jimenez@...cle.com>
To: Alex Mastro <amastro@...com>, Jason Gunthorpe <jgg@...pe.ca>
Cc: Alex Williamson <alex.williamson@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] vfio: fix VFIO_IOMMU_UNMAP_DMA when end of range would
overflow u64
Hi Alex, Jason,
On 10/6/25 8:39 PM, Alex Mastro wrote:
> On Mon, Oct 06, 2025 at 07:50:39PM -0300, Jason Gunthorpe wrote:
>> Could we block right at the ioctl inputs that end at ULONG_MAX? Maybe
>> that is a good enough fix?
>
> That would be a simpler, and perhaps less risky fix than surgically fixing every
> affected place.
>
If going this way, we'd also have to deny the MAP requests. Right now we
have this problem because you can map a range ending at the top of the
address space, but not unmap it. The map operation doesn't overflow
because vfio_dma_do_map() calls vfio_link_dma() with dma->size == 0, and
it grows the size later in vfio_pin_map_dma()...
https://elixir.bootlin.com/linux/v6.17.1/source/drivers/vfio/vfio_iommu_type1.c#L1676-L1677
> If we were to do that, I think VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE should
> report coherent limits -- i.e. it should "lie" and say the last page (or
> whatever) of u64 addressable space is inaccessible so that the UAPI is coherent
> with itself. It should be legal to map/unmap iova ranges up to the limits it
> claims to support.
>
Agree.
> I have doubts that anyone actually relies on MAP_DMA-ing such
> end-of-u64-mappings in practice, so perhaps it's OK?
>
The AMD IOMMU supports a 64-bit IOVA, so when using the AMD vIOMMU with
DMA remapping enabled + VF passthrough + a Linux guest with
iommu.forcedac=1 we hit this issue since the driver (mlx5) starts
requesting mappings for IOVAs right at the top of the address space.
I mentioned this issue on the cover letter for:
https://lore.kernel.org/qemu-devel/20250919213515.917111-1-alejandro.j.jimenez@oracle.com/
and linked to my candidate fix:
https://github.com/aljimenezb/linux/commit/014be8cafe7464d278729583a2dd5d94514e2e2a
The reason why I hadn't send it to the list yet is because I noticed the
additional locations Jason mentioned earlier in the thread (e.g.
vfio_find_dma(), vfio_iommu_replay()) and wanted to put together a
reproducer that also triggered those paths.
I mentioned in the notes for the patch above why I chose a slightly more
complex method than the '- 1' approach, since there is a chance that
iova+size could also go beyond the end of the address space and actually
wrap around.
My goal was not to trust the inputs at all if possible. We could also
use check_add_overflow() if we want to add explicit error reporting in
vfio_iommu_type1 when an overflow is detected. i.e. catch bad callers
that don't sanitize their inputs as opposed to letting them handle the
error on the return path...
Thank you,
Alejandro
Powered by blists - more mailing lists