Message-ID: <82688fc5-c294-1db3-9a05-fecf9b9b5e17@arm.com>
Date: Mon, 2 Jul 2018 14:06:02 +0100
From: Robin Murphy <robin.murphy@....com>
To: benh@....ibm.com, Christoph Hellwig <hch@....de>
Cc: Russell Currey <ruscur@....ibm.com>,
iommu@...ts.linux-foundation.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Jens Axboe <jens.axboe@...cle.com>
Subject: Re: DMA mappings and crossing boundaries
Hi Ben,
On 24/06/18 08:32, Benjamin Herrenschmidt wrote:
> Hi Folks !
>
> So, to work around issues with devices having too-strict limitations
> in DMA address bits (GPUs, ugh...) on POWER, we've been playing with a
> mechanism that does dynamic mapping in the IOMMU but uses a very large
> IOMMU page size (256M on POWER8 and 1G on POWER9) for performance.
>
> Now, with such a page size, we can't just pop out new entries for
> every DMA map; we need to try to re-use entries for mappings in the
> same "area".
>
> We've prototyped something using refcounts on the entries. It does
> imply some locking which is potentially problematic, and we'll be
> looking at options there in the long run, but it works... so far.
>
> My worry is that it will fail if we ever get a mapping request (or
> coherent allocation request) that spans one of those giant page
> boundaries. At least with our current implementation.
>
> AFAIK, dma_alloc_coherent() is defined (Documentation/DMA-API-
> HOWTO.txt) as always allocating to the next power-of-2 order, so we
> should never have the problem unless we allocate a single chunk larger
> than the IOMMU page size.
(and even then it's not *that* much of a problem, since it comes down to
just finding n > 1 consecutive unused IOMMU entries for exclusive use by
that new chunk)
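To make that concrete, here's a minimal userspace sketch of the
refcounted reuse scheme described above - the names (iommu_slot,
iommu_get_slot), the table size and the 1:1 slot layout are invented
for the example, and the locking a real implementation needs is elided:

#include <stdint.h>

#define IOMMU_PAGE_SHIFT 28            /* 256M large IOMMU pages (POWER8) */
#define IOMMU_PAGE_SIZE  (1ULL << IOMMU_PAGE_SHIFT)
#define IOMMU_NR_SLOTS   16            /* hypothetical table size */

struct iommu_slot {
	uint64_t phys_base;            /* CPU physical base this entry maps */
	unsigned int refcount;         /* live mappings inside this entry */
};

static struct iommu_slot slots[IOMMU_NR_SLOTS];

/* Reuse an existing entry covering @phys, or claim a free one. */
static int iommu_get_slot(uint64_t phys)
{
	uint64_t base = phys & ~(IOMMU_PAGE_SIZE - 1);
	int i, free = -1;

	/* (a real implementation takes the table lock here) */
	for (i = 0; i < IOMMU_NR_SLOTS; i++) {
		if (slots[i].refcount && slots[i].phys_base == base) {
			slots[i].refcount++;   /* reuse existing mapping */
			return i;
		}
		if (!slots[i].refcount && free < 0)
			free = i;
	}
	if (free < 0)
		return -1;                     /* table full */

	slots[free].phys_base = base;          /* program a new entry */
	slots[free].refcount = 1;
	return free;
}

static void iommu_put_slot(int i)
{
	if (i >= 0 && i < IOMMU_NR_SLOTS && slots[i].refcount)
		slots[i].refcount--;           /* entry reclaimable at zero */
}

int main(void)
{
	int a = iommu_get_slot(0x12345000ULL);
	int b = iommu_get_slot(0x12346000ULL); /* same 256M window: reused */

	/* a == b here, and the entry's refcount is now 2 */
	iommu_put_slot(b);
	iommu_put_slot(a);
	return 0;
}

The key point is that a second mapping landing in an already-covered
256M window just bumps the refcount instead of consuming another entry.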
> For dma_map_sg() however, if a request has a single "entry" spanning
> such a boundary, we need to ensure that the resulting mapping is 2
> contiguous "large" IOMMU pages as well.
>
> However, that doesn't fit well with us re-using existing mappings since
> they may already exist and either not be contiguous, or partially exist
> with no free hole around them.
>
> Now, we *could* possibly contrive a way to solve this by detecting
> this case and just allocating another "pair" (or set, if we cross even
> more pages) of IOMMU pages elsewhere, thus partially breaking our
> re-use scheme.
>
> But while doable, this introduces some serious complexity in the
> implementation, which I would very much like to avoid.
>
> So I was wondering if you guys thought that was ever likely to happen?
> Do you see reasonable cases where dma_map_sg() would be called with a
> list in which a single entry crosses a 256M or 1G boundary?
For streaming mappings of buffers cobbled together out of any old CPU
pages (e.g. user memory), you may well happen to get two
physically-adjacent pages falling either side of an IOMMU boundary,
which comprise all or part of a single request - note that whilst it's
probably less likely than the scatterlist case, this could technically
happen for dma_map_{page, single}() calls too.
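As a concrete (if contrived) example of what that means for a single
mapping - the helper and the buffer below are invented for
illustration, the arithmetic is just a page-number comparison:

#include <stdint.h>
#include <stdio.h>

#define IOMMU_PAGE_SHIFT 28   /* 256M on POWER8; 30 would model 1G on POWER9 */
#define IOMMU_PAGE_SIZE  (1ULL << IOMMU_PAGE_SHIFT)

/* How many large IOMMU entries does a physically-contiguous region need? */
static unsigned int iommu_entries_needed(uint64_t phys, uint64_t len)
{
	uint64_t first = phys >> IOMMU_PAGE_SHIFT;
	uint64_t last  = (phys + len - 1) >> IOMMU_PAGE_SHIFT;

	return (unsigned int)(last - first + 1);
}

int main(void)
{
	/* An 8K buffer whose two pages sit either side of a 256M boundary */
	uint64_t phys = IOMMU_PAGE_SIZE - 4096;

	printf("entries needed: %u\n", iommu_entries_needed(phys, 8192));
	return 0;	/* prints 2: the mapping straddles the boundary */
}

With 4K CPU pages and 256M IOMMU pages that layout is rare, but nothing
in the dma_map_page()/dma_map_single() contract rules it out.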
Conceptually it looks pretty easy to extend the allocation constraints
to cope with that - even the pathological worst case would have an
absolute upper bound of 3 IOMMU entries for any one physical region -
but if in practice it's a case of mapping arbitrary CPU pages to 32-bit
DMA addresses, with only four 1GB slots to play with, I can't really
see a way to make that practical :(
Maybe the best compromise would be some sort of hybrid scheme which
makes sure that one of the IOMMU entries always covers the SWIOTLB
buffer, and invokes software bouncing for the awkward cases.
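Roughly what I have in mind, with placeholder helpers
(iommu_map_direct(), bounce_into_swiotlb()) standing in for the real
table and bounce-buffer machinery - this is only a sketch of the
decision, not of any existing API:

#include <stdint.h>
#include <stdbool.h>

#define IOMMU_PAGE_SHIFT  28
#define DMA_ERROR         (~0ULL)

/* Stand-in: map via a reusable large IOMMU entry (the common path). */
static uint64_t iommu_map_direct(uint64_t phys, uint64_t len)
{
	(void)len;
	return phys;	/* pretend the large entry maps 1:1 */
}

/* Stand-in: copy through a bounce buffer that one entry permanently covers. */
static uint64_t bounce_into_swiotlb(uint64_t phys, uint64_t len)
{
	(void)phys; (void)len;
	return 0x1000;	/* pretend this lands inside the bounced window */
}

static bool crosses_iommu_boundary(uint64_t phys, uint64_t len)
{
	return (phys >> IOMMU_PAGE_SHIFT) !=
	       ((phys + len - 1) >> IOMMU_PAGE_SHIFT);
}

/* Map one segment: the easy case reuses a large entry, the awkward
 * boundary-crossing case falls back to software bouncing. */
uint64_t hybrid_map_segment(uint64_t phys, uint64_t len)
{
	if (!len)
		return DMA_ERROR;
	if (!crosses_iommu_boundary(phys, len))
		return iommu_map_direct(phys, len);

	return bounce_into_swiotlb(phys, len);
}

The copy obviously costs, but only for segments that actually straddle
a boundary, which keeps the common path on the re-use scheme.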
Robin.