Message-ID: <BN9PR11MB5276CB1BFF85D1569B1D863F8C14A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 15 Aug 2023 03:15:15 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Baolu Lu <baolu.lu@...ux.intel.com>,
Jie Ji <jijie.ji@...ux.alibaba.com>,
"dwmw2@...radead.org" <dwmw2@...radead.org>,
"joro@...tes.org" <joro@...tes.org>,
"will@...nel.org" <will@...nel.org>,
"robin.murphy@....com" <robin.murphy@....com>,
Alex Williamson <alex.williamson@...hat.com>
CC: "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"xianting.tian@...ux.alibaba.com" <xianting.tian@...ux.alibaba.com>,
"kaijieguo@...ux.alibaba.com" <kaijieguo@...ux.alibaba.com>,
"daishengdong@...h.net" <daishengdong@...h.net>
Subject: RE: [PATCH] iommu/vt-d: Atomic breakdown of IOPT into finer
granularity
> From: Baolu Lu <baolu.lu@...ux.intel.com>
> Sent: Tuesday, August 15, 2023 10:06 AM
>
> [Please allow me to include Kevin and Alex in this thread.]
>
> On 2023/8/14 20:10, Jie Ji wrote:
> > With the addition of IOMMU support for I/O page faults, it's now
> > possible to unpin memory that is DMA-remapped. However, the lack of
> > support for unmapping a subrange of the I/O page table (IOPT) in the
> > IOMMU can lead to some issues.
>
> Is this the right contract for how iommu_map/unmap() should be used?
> If I remember correctly, IOVA ranges should be mapped and unmapped in
> pairs. That means, if a range is mapped by iommu_map(), the same range
> should be unmapped with iommu_unmap().
>
> Any misunderstanding or anything changed?
>
> >
> > For instance, a virtual machine can establish 2M/1G mappings in the
> > IOPT for better performance, while the host system enables swap and
> > attempts to swap out some 4K pages. Unfortunately, unmapping a
> > subrange of such a large-page mapping drives the IOMMU page walk to
> > an error level and finally causes a kernel crash.
>
> Sorry that I can't fully understand this use case. Are you talking
> about nested translation, where user space manages its own I/O page
> tables? But how can those pages be swapped out?
>
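
For reference, the pairing contract in question: a range mapped by one
iommu_map() call must later be unmapped as that same range. A minimal
illustrative snippet (not from the patch; SZ_2M/SZ_4K are the standard
kernel size macros):

#include <linux/iommu.h>
#include <linux/sizes.h>

/*
 * Illustrative only: a range mapped by a single iommu_map() call is
 * later unmapped as that same range.
 */
static void iopt_pairing_example(struct iommu_domain *domain,
                                 unsigned long iova, phys_addr_t paddr)
{
        /* The driver may install this as one 2M large-page PTE. */
        if (iommu_map(domain, iova, paddr, SZ_2M,
                      IOMMU_READ | IOMMU_WRITE, GFP_KERNEL))
                return;

        /* Correct: unmap exactly the range that was mapped. */
        iommu_unmap(domain, iova, SZ_2M);

        /*
         * Not supported today: punching a 4K hole out of the 2M
         * mapping, i.e. what the patch under discussion tries to allow:
         *
         *      iommu_unmap(domain, iova + SZ_4K, SZ_4K);
         */
}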

It's not related to nested translation. I think they are interested in
I/O page faults at stage-2, so there is no need to pin the guest memory.

But I don't think this patch alone makes sense. It should be part of a
bigger series which enables iommufd to support stage-2 page faults,
e.g. iommufd will register a fault handler on the stage-2 hwpt which
first calls handle_mm_fault() to fix the CPU page table and then calls
iommu_map() to set up the IOVA mapping. Then, upon an mmu notifier for
any host mapping change from mm, iommufd calls iommu_unmap() or other
helpers to adjust the IOVA mapping accordingly. Roughly, the fault path
could look like the sketch below.
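
A minimal, purely illustrative sketch of such a fault handler. Note
that struct iommufd_hwpt, iova_to_hva() and hva_to_pfn() are made-up
placeholders for lookups iommufd would have to provide, while
handle_mm_fault(), vma_lookup(), the mmap lock helpers and iommu_map()
(with the gfp argument of recent kernels) are existing APIs:

#include <linux/iommu.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/* Hypothetical container tying a stage-2 domain to an address space. */
struct iommufd_hwpt {
        struct iommu_domain *domain;
        struct mm_struct *owner_mm;
        struct mmu_notifier mn;
};

/* Illustrative placeholders for the IOVA<->HVA and HVA->PFN lookups. */
static unsigned long iova_to_hva(struct iommufd_hwpt *hwpt,
                                 unsigned long iova);
static unsigned long hva_to_pfn(struct mm_struct *mm, unsigned long hva);

static int iommufd_s2_fault_handler(struct iommufd_hwpt *hwpt,
                                    unsigned long iova)
{
        struct mm_struct *mm = hwpt->owner_mm;
        struct vm_area_struct *vma;
        unsigned long hva, pfn;
        vm_fault_t flt;

        /* Translate the faulting IOVA back to a host virtual address. */
        hva = iova_to_hva(hwpt, iova);

        mmap_read_lock(mm);
        vma = vma_lookup(mm, hva);
        if (!vma) {
                mmap_read_unlock(mm);
                return -EFAULT;
        }

        /* 1. Fix the CPU page table first (fault in / swap in the page). */
        flt = handle_mm_fault(vma, hva, FAULT_FLAG_WRITE, NULL);
        if (flt & VM_FAULT_ERROR) {
                mmap_read_unlock(mm);
                return -EFAULT;
        }

        /* 2. Look up the now-present PFN (placeholder for a pte walk). */
        pfn = hva_to_pfn(mm, hva);
        mmap_read_unlock(mm);

        /* 3. Mirror the translation into the stage-2 IOPT. */
        return iommu_map(hwpt->domain, iova & PAGE_MASK,
                         (phys_addr_t)pfn << PAGE_SHIFT, PAGE_SIZE,
                         IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
}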

The io_pagetable metadata which tracks the user's map requests is
unchanged in that process.
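
On the teardown side, a sketch of the notifier path, reusing the
illustrative struct iommufd_hwpt above. hva_to_iova() is again a
made-up reverse lookup; iommu_unmap() and the .invalidate_range hook
are the real APIs as of this writing:

/* Illustrative reverse lookup from host VA back to IOVA. */
static unsigned long hva_to_iova(struct iommufd_hwpt *hwpt,
                                 unsigned long hva);

static void iommufd_s2_invalidate_range(struct mmu_notifier *mn,
                                        struct mm_struct *mm,
                                        unsigned long start,
                                        unsigned long end)
{
        struct iommufd_hwpt *hwpt = container_of(mn, struct iommufd_hwpt, mn);
        unsigned long iova = hva_to_iova(hwpt, start);

        /*
         * Tear down only the affected IOVA range; the io_pagetable
         * metadata tracking the user's original request stays intact.
         */
        iommu_unmap(hwpt->domain, iova, end - start);
}

static const struct mmu_notifier_ops iommufd_s2_mn_ops = {
        .invalidate_range = iommufd_s2_invalidate_range,
};

Which notifier hook is the right one (.invalidate_range vs. the
invalidate_range_start/end pair) and how the reverse lookup is kept in
sync are exactly the kind of design questions such a series would need
to answer.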

The vfio driver needs to report to iommufd whether a bound device can
fully support I/O page faults for all DMA requests (beyond what PCI PRI
allows).

There is a lot to do before we need to take time to review this
iommu-driver-specific change.