Message-ID: <a63db6a8-e9d8-4f79-8212-8710ce2e60f4@linux.alibaba.com>
Date: Wed, 29 Oct 2025 22:44:31 +0800
From: Shuai Xue <xueshuai@...ux.alibaba.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: iommu@...ts.linux.dev, kevin.tian@...el.com, joro@...tes.org,
 will@...nel.org, robin.murphy@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] iommu: iommufd: Explicitly check for VM_PFNMAP in
 iommufd_ioas_map
On 2025/10/29 21:34, Jason Gunthorpe wrote:
> On Wed, Oct 29, 2025 at 08:52:26PM +0800, Shuai Xue wrote:
>> The iommufd_ioas_map function currently returns -EFAULT when attempting
>> to map VM_PFNMAP VMAs because pin_user_pages_fast() cannot handle such
>> mappings. This error code is misleading and does not accurately reflect
>> the nature of the failure.
Hi, Jason,
> 
> Sure, but why do you care? Userspace should know not to do this based
> on how it created the mmaps, not rely on errnos to figure it out after
> the fact.
We run different VMMs (QEMU, Kata Containers) to meet diverse business
requirements, while our production environment deploys various evolving
kernel versions. Additionally, we are migrating from VFIO Type 1 to
IOMMUFD. Although IOMMUFD claims to provide compatible
iommufd_vfio_ioctl APIs, these APIs are not fully compatible in
practice. For example, with VFIO_IOMMU_MAP_DMA, iommufd_vfio_map_dma
doesn't support MMIO mapping, and we can only rely on the implicit
EFAULT error from pin_user_pages_fast(). (I initially considered adding
explicit checks in iommufd_vfio_map_dma, but I noticed you plan to add
dma_buf support there.)
While we certainly aim for a seamless migration from VFIO Type 1 to
IOMMUFD, as you know, this isn't always feasible.
For GPU-related issues encountered in production, the debugging path is
quite long - from business teams to virtualization teams, and finally to
our kernel team.
Therefore, having explicit checks with deterministic error codes
returned to userspace would be greatly appreciated.
> 
>> +static bool iommufd_check_vm_pfnmap(unsigned long vaddr)
>> +{
>> +	struct mm_struct *mm = current->mm;
>> +	struct vm_area_struct *vma;
>> +	bool ret = false;
>> +
>> +	mmap_read_lock(mm);
>> +	vaddr = untagged_addr_remote(mm, vaddr);
>> +	vma = vma_lookup(mm, vaddr);
>> +	if (vma && vma->vm_flags & VM_PFNMAP)
>> +		ret = true;
>> +	mmap_read_unlock(mm);
> 
> This isn't really sufficient, the range can span multiple VMAs and you
> can hit special PTEs in PFNMAPs, or you can hit P2P struct pages in
> fully normal VMAs.
> 
> I think if you really want this errno distinction it should come from
> pin_user_pages() directly as only it knows the reason it didn't work.
> 
Aha, I see. Thank you for pointing out this issue. The check indeed
needs to be more comprehensive. Would you mind if I used
pin_user_pages() as a precheck?
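
To illustrate the first part of the objection above (a range can span
multiple VMAs), here is a minimal userspace model, not kernel code. The
names model_vma, MODEL_VM_PFNMAP, model_vma_lookup, first_addr_is_pfnmap
and range_touches_pfnmap are all hypothetical and exist only for this
sketch. It shows how a check of the first address alone can miss a
PFNMAP region later in the range. It does not model special PTEs or P2P
struct pages in normal VMAs, which is why the suggestion was that only
pin_user_pages() really knows the failure reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag for this model; NOT the kernel's VM_PFNMAP value. */
#define MODEL_VM_PFNMAP 0x1u

/* Simplified stand-in for a VMA: covers [start, end). */
struct model_vma {
	uintptr_t start;
	uintptr_t end;
	unsigned int flags;
};

/* Return the VMA containing addr, or NULL if addr falls in a hole. */
const struct model_vma *model_vma_lookup(const struct model_vma *vmas,
					 size_t n, uintptr_t addr)
{
	assert(vmas != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (addr >= vmas[i].start && addr < vmas[i].end)
			return &vmas[i];
	return NULL;
}

/* Same shape as the quoted helper: inspects only the first address. */
bool first_addr_is_pfnmap(const struct model_vma *vmas, size_t n,
			  uintptr_t addr)
{
	const struct model_vma *vma = model_vma_lookup(vmas, n, addr);

	return vma && (vma->flags & MODEL_VM_PFNMAP);
}

/* Inspect every VMA that overlaps [addr, addr + len). */
bool range_touches_pfnmap(const struct model_vma *vmas, size_t n,
			  uintptr_t addr, size_t len)
{
	uintptr_t end = addr + len;

	if (len == 0 || end < addr)	/* empty range or wrap-around */
		return false;
	for (size_t i = 0; i < n; i++)
		if (vmas[i].start < end && vmas[i].end > addr &&
		    (vmas[i].flags & MODEL_VM_PFNMAP))
			return true;
	return false;
}
```

With a normal VMA at [0x1000, 0x3000) followed by a PFNMAP VMA at
[0x3000, 0x5000), a request starting at 0x2000 with length 0x2000 passes
the first-address check but is caught by the range check.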
Thanks for the quick reply.
Best Regards,
Shuai