Message-ID: <20220531162152.GH1343366@nvidia.com>
Date: Tue, 31 May 2022 13:21:52 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Baolu Lu <baolu.lu@...ux.intel.com>,
Joerg Roedel <joro@...tes.org>,
Kevin Tian <kevin.tian@...el.com>,
Ashok Raj <ashok.raj@...el.com>,
Christoph Hellwig <hch@...radead.org>,
Will Deacon <will@...nel.org>, Liu Yi L <yi.l.liu@...el.com>,
Jacob jun Pan <jacob.jun.pan@...el.com>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in
debugfs

On Tue, May 31, 2022 at 05:01:46PM +0100, Robin Murphy wrote:
> The DMA API doesn't need locking, partly since it can trust itself not to do
> stupid things, and mostly because it's DMA API performance that's
> fundamentally incompatible with serialisation anyway. Why do you think we
> have a complicated per-CPU IOVA caching mechanism, if not to support big
> multi-queue devices with multiple CPU threads mapping/unmapping in different
> parts of the same DMA domain concurrently?

Well, per-CPU is a form of locking.
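
ie roughly this shape - the fast path only ever touches CPU-local
state, with CPU locality (preemption disabled) standing in for the
lock, and the shared pool still takes a real lock on the refill
path. This is only an illustrative sketch with made-up names, not the
actual iova.c code:

#include <linux/percpu.h>
#include <linux/spinlock.h>

/* Illustrative only: a per-CPU magazine with a locked slow path */
struct pcp_iova_cache {
	unsigned long pfns[8];
	unsigned int nr;
};

static DEFINE_PER_CPU(struct pcp_iova_cache, pcp_iova);
static DEFINE_SPINLOCK(iova_depot_lock);

static unsigned long sketch_alloc_iova(void)
{
	struct pcp_iova_cache *c;
	unsigned long pfn = 0;

	c = get_cpu_ptr(&pcp_iova);	/* disables preemption */
	if (c->nr)
		pfn = c->pfns[--c->nr];
	put_cpu_ptr(&pcp_iova);

	if (!pfn) {
		/* Slow path: the shared depot still needs a real lock */
		spin_lock(&iova_depot_lock);
		/* ... refill/allocate from the shared tree here ... */
		spin_unlock(&iova_depot_lock);
	}
	return pfn;
}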

So what are the actual locking rules here? We can call map/unmap
concurrently but not if ... ?

IOVA overlaps?

And we expect the iommu driver to be unable to free page table levels
that have IOVA boundaries in them?

> The simpler drivers already serialise on a per-domain lock internally, while
> the more performance-focused ones implement lock-free atomic pagetable
> management in a similar style to CPU arch code; either way it should work
> fine as-is.

The mm has page table locks at every level and generally expects them
to be held for a lot of manipulations. There are some lockless cases,
but it is not as aggressive as this sounds.
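
For comparison, the lock-free style amounts to installing a new table
level with cmpxchg and having the loser of the race free its page, so
concurrent mappers in disjoint IOVA ranges never block each other.
Very roughly, with made-up names and a made-up mask, something like:

#include <linux/atomic.h>
#include <linux/bits.h>
#include <linux/gfp.h>
#include <linux/io.h>
#include <linux/types.h>

/* Made-up mask, just to show the shape of the walk */
#define SKETCH_ADDR_MASK	GENMASK_ULL(47, 12)

static u64 *sketch_install_table(u64 *ptep, u64 *new_table, u64 new_desc)
{
	u64 old;

	/* Install the new level only if the slot is still empty */
	old = cmpxchg64(ptep, 0, new_desc);
	if (old) {
		/* Lost the race - free ours, walk the winner's table */
		free_page((unsigned long)new_table);
		return phys_to_virt(old & SKETCH_ADDR_MASK);
	}
	return new_table;
}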

> The difference with debugfs is that it's a completely orthogonal
> side-channel - an iommu_domain user like VFIO or iommu-dma can make sure its
> *own* API usage is sane, but can't be aware of the user triggering some
> driver-internal introspection of that domain in a manner that could race
> more harmfully.

The mm solution to this problem is to RCU free the page table
levels. This way something like debugfs can read a page table under
RCU completely safely, though incoherently, and there is no
performance cost on the map/unmap fast path side.

Today struct page has an rcu_head that can be used to RCU-free it, so
it costs nothing.
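
ie something along these lines - the names are made up and this is
not an existing IOMMU API, but the mechanism is just call_rcu() on
page->rcu_head on the free side and rcu_read_lock() around the
debugfs walk:

#include <linux/mm.h>
#include <linux/rcupdate.h>
#include <linux/seq_file.h>
#include <linux/types.h>

static void sketch_pgtable_free_rcu(struct rcu_head *head)
{
	struct page *page = container_of(head, struct page, rcu_head);

	__free_page(page);
}

/* unmap side: defer the actual free past a grace period */
static void sketch_free_pgtable_level(void *table)
{
	struct page *page = virt_to_page(table);

	call_rcu(&page->rcu_head, sketch_pgtable_free_rcu);
}

/* debugfs side: incoherent but never use-after-free.  A multi-level
 * walk would hold rcu_read_lock() across the whole traversal. */
static void sketch_dump_level(struct seq_file *m, u64 *table, int nents)
{
	int i;

	rcu_read_lock();
	for (i = 0; i < nents; i++)
		seq_printf(m, "%03d: %016llx\n", i, READ_ONCE(table[i]));
	rcu_read_unlock();
}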

Jason