Message-ID: <20250506141229.363a7462.alex.williamson@redhat.com>
Date: Tue, 6 May 2025 14:12:29 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: alex.williamson@...hat.com, peterx@...hat.com
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, Adolfo
<adolfotregosa@...il.com>, stable@...r.kernel.org
Subject: Re: [PATCH] vfio/pci: Align huge faults to order
On Fri, 2 May 2025 16:40:31 -0600
Alex Williamson <alex.williamson@...hat.com> wrote:
> The vfio-pci huge_fault handler makes no attempt to align a huge
> mapping around the faulting address; it only inserts huge mappings when
> the faulting address and resulting pfn are already aligned to the
> order. This works in a lot of cases, particularly in conjunction with
> QEMU, where DMA mappings fault the mmap linearly. However, there are
> configurations where we don't get that linear faulting and pages are
> instead faulted on demand.
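For illustration, the pre-patch insertion condition can be paraphrased as
the standalone sketch below (old_can_insert_order is a hypothetical name,
not a kernel function; 4K pages are assumed):

#include <stdbool.h>

#define PAGE_SHIFT 12                   /* assuming 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/*
 * Pre-patch behavior, paraphrased: insert a huge mapping of the given
 * order only if the faulting address and pfn are both naturally aligned
 * to the order and the mapping fits within the vma.  Linear faulting
 * reaches each aligned boundary first, so QEMU usually satisfies this;
 * on-demand faults at arbitrary addresses fall back to smaller orders.
 */
static bool old_can_insert_order(unsigned long address, unsigned long pfn,
                                 unsigned long vm_end, unsigned int order)
{
        return !(pfn & ((1UL << order) - 1)) &&
               !(address & ((PAGE_SIZE << order) - 1)) &&
               address + (PAGE_SIZE << order) <= vm_end;
}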
>
> The scenario reported in the bug below is such a case, where the physical
> address width of the CPU is greater than that of the IOMMU, resulting in a
> VM where guest firmware has mapped device MMIO beyond the address width of
> the IOMMU. In this configuration, the MMIO is faulted on demand and
> tracing indicates that occasionally the faults generate a VM_FAULT_OOM.
> Given the use case, this results in an "error: kvm run failed Bad address",
> killing the VM.
>
> The host is not under memory pressure in this test, therefore it's
> suspected that VM_FAULT_OOM is actually the result of a NULL return from
> __pte_offset_map_lock() in the get_locked_pte() path from insert_pfn().
> This suggests a potential race inserting a pte concurrent to a pmd, and
> maybe indicates some deficiency in the mm layer properly handling such a
> case.
>
> Nevertheless, Peter noted an inconsistency in vfio-pci's huge_fault
> handler: mapping granularity depends on the incidental alignment of the
> faulting address, rather than aligning the faulting address down to the
> order so that huge mappings are inserted consistently. This change not
> only uses the page tables more consistently and efficiently, but since
> any fault within an aligned range results in the same mapping, the race
> condition suspected in the VM_FAULT_OOM is also avoided.
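For illustration, a minimal standalone sketch of the new align-down
arithmetic (hypothetical addresses; assuming 4K pages and order 9, i.e. a
2MB PMD mapping):

#include <stdio.h>

#define PAGE_SHIFT 12                   /* assuming 4K pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

int main(void)
{
        unsigned int order = 9;                         /* PMD order with 4K pages */
        unsigned long vm_start = 0x7f0000000000UL;      /* hypothetical vma start */
        unsigned long fault = 0x7f00005a3000UL;         /* unaligned faulting address */

        /* Align the faulting address down to the mapping order. */
        unsigned long addr = fault & ~((PAGE_SIZE << order) - 1);
        unsigned long pgoff = (addr - vm_start) >> PAGE_SHIFT;

        /*
         * Any fault within [addr, addr + 2MB) computes this same addr
         * and pgoff, so every fault in the range resolves to one
         * identical huge mapping: addr = 0x7f0000400000, pgoff = 0x400.
         */
        printf("addr = 0x%lx, pgoff = 0x%lx\n", addr, pgoff);
        return 0;
}

The pfn is then derived from the aligned pgoff (vma_to_pfn(vma) + pgoff),
leaving pfn alignment and the vma bounds as the only remaining fallback
conditions, as seen in the diff below.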
>
> Reported-by: Adolfo <adolfotregosa@...il.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=220057
> Fixes: 09dfc8a5f2ce ("vfio/pci: Fallback huge faults for unaligned pfn")
> Cc: stable@...r.kernel.org
> Tested-by: Adolfo <adolfotregosa@...il.com>
> Co-developed-by: Peter Xu <peterx@...hat.com>
> Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
> ---
> drivers/vfio/pci/vfio_pci_core.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 35f9046af315..6328c3a05bcd 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1646,14 +1646,14 @@ static vm_fault_t vfio_pci_mmap_huge_fault(struct vm_fault *vmf,
> {
> struct vm_area_struct *vma = vmf->vma;
> struct vfio_pci_core_device *vdev = vma->vm_private_data;
> - unsigned long pfn, pgoff = vmf->pgoff - vma->vm_pgoff;
> + unsigned long addr = vmf->address & ~((PAGE_SIZE << order) - 1);
> + unsigned long pgoff = (addr - vma->vm_start) >> PAGE_SHIFT;
> + unsigned long pfn = vma_to_pfn(vma) + pgoff;
> vm_fault_t ret = VM_FAULT_SIGBUS;
>
> - pfn = vma_to_pfn(vma) + pgoff;
> -
> - if (order && (pfn & ((1 << order) - 1) ||
> - vmf->address & ((PAGE_SIZE << order) - 1) ||
> - vmf->address + (PAGE_SIZE << order) > vma->vm_end)) {
> + if (order && (addr < vma->vm_start ||
> + addr + (PAGE_SIZE << order) > vma->vm_end ||
> + pfn & ((1 << order) - 1))) {
> ret = VM_FAULT_FALLBACK;
> goto out;
> }
Applied to the vfio for-linus branch for v6.15. Added Peter's
Signed-off-by to accompany the Co-developed-by tag, in agreement with
Peter, and changed the bugzilla Link: tag to Closes: to satisfy
checkpatch. Thanks,
Alex