Message-Id: <7090CB2E-8D63-44B1-A739-932FFA649BC9@linux.alibaba.com>
Date: Sat, 24 Feb 2018 13:44:07 +0800
From: jason <jason.cai@...ux.alibaba.com>
To: jason <jason.cai@...ux.alibaba.com>, alex.williamson@...hat.com,
pbonzini@...hat.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc: gnehzuil@...ux.alibaba.com
Subject: [RFC] vfio iommu type1: improve memory pinning process for raw PFN
mapping
When using vfio to pass through a PCIe device (e.g. a GPU card) that
has a huge BAR (e.g. 16GB), a lot of cycles are wasted on memory
pinning, because the PFNs of a PCI BAR are not backed by struct page
and the corresponding VMA has the flags VM_IO|VM_PFNMAP.

With this change, the memory pinning process first checks whether the
corresponding region is a raw PFN mapping, and if so it skips the
unnecessary user memory pinning process.

Even though this comes with a little overhead on each call (finding
the vma and testing its flags), it can significantly improve a VM's
boot-up time when passing through devices via VFIO.
---
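The address arithmetic in try_io_pfnmap() below can be checked in user
space with a small model. This is a minimal sketch, not kernel code.
struct vma_model, model_pfn(), model_npages_patch() and
model_npages_remaining() are hypothetical names used only for
illustration, PAGE_SHIFT is assumed to be 12 (4 KiB pages), and vm_pgoff
is assumed to hold the PFN backing the first page of the mapping, which
is what the patch relies on. model_npages_remaining() is a variation
that clamps to the pages left between vaddr and vm_end; the patch itself
clamps to the full size of the VMA.

```c
#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields the patch reads. */
struct vma_model {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* one past the last byte of the mapping */
	unsigned long vm_pgoff;	/* assumed: PFN backing vm_start */
};

/* Same arithmetic as the *pfn assignment in try_io_pfnmap(). */
static unsigned long model_pfn(const struct vma_model *vma, unsigned long vaddr)
{
	return ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* Clamp used by the patch: at most the total number of pages in the VMA. */
static long model_npages_patch(const struct vma_model *vma, long npage)
{
	long total = (long)((vma->vm_end - vma->vm_start) >> PAGE_SHIFT);

	return npage < total ? npage : total;
}

/* Variation (not in the patch): at most the pages from vaddr to vm_end. */
static long model_npages_remaining(const struct vma_model *vma,
				   unsigned long vaddr, long npage)
{
	long left = (long)((vma->vm_end - vaddr) >> PAGE_SHIFT);

	return npage < left ? npage : left;
}
```

For a 16-page mapping with vm_pgoff 0xe0000, model_pfn() three pages in
gives 0xe0003. Asking for 64 pages starting ten pages in, the patch's
clamp gives 16 while the variation gives 6, so the two differ whenever
vaddr is not at the start of the VMA.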
drivers/vfio/vfio_iommu_type1.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e30e29ae4819..1a471ece3f9c 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -374,6 +374,24 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 	return ret;
 }
 
+static int try_io_pfnmap(struct mm_struct *mm, unsigned long vaddr, long npage,
+			 unsigned long *pfn)
+{
+	struct vm_area_struct *vma;
+	int pinned = 0;
+
+	down_read(&mm->mmap_sem);
+	vma = find_vma_intersection(mm, vaddr, vaddr + 1);
+	if (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)) {
+		*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+		if (is_invalid_reserved_pfn(*pfn))
+			pinned = min(npage, (long)vma_pages(vma));
+	}
+	up_read(&mm->mmap_sem);
+
+	return pinned;
+}
+
 /*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
@@ -392,6 +410,10 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
 	if (!current->mm)
 		return -ENODEV;
 
+	ret = try_io_pfnmap(current->mm, vaddr, npage, pfn_base);
+	if (ret)
+		return ret;
+
 	ret = vaddr_get_pfn(current->mm, vaddr, dma->prot, pfn_base);
 	if (ret)
 		return ret;
--
2.13.6