Message-ID: <CABdmKX0KRA3NHiEJYsq5LqtwwEdM4LaONpyukd6zgk7hHzp3Cg@mail.gmail.com>
Date: Thu, 9 May 2024 11:32:38 -0700
From: "T.J. Mercier" <tjmercier@...gle.com>
To: Christoph Hellwig <hch@....de>
Cc: Catalin Marinas <catalin.marinas@....com>, Marek Szyprowski <m.szyprowski@...sung.com>,
Robin Murphy <robin.murphy@....com>, isaacmanjarres@...gle.com, iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
On Thu, May 9, 2024 at 6:07 AM Christoph Hellwig <hch@....de> wrote:
>
> On Thu, May 09, 2024 at 08:49:40AM +0100, Catalin Marinas wrote:
> > I see the swiotlb use as some internal detail of the DMA API
> > implementation that should not leak outside this framework.
>
> And that's what it is.
>
> > I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
> > However, this is not sufficient with a proper use of the DMA API since
> > the first dma_map_*() without this attribute can still do the bouncing.
> > IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
> > be used on the first map and potentially on subsequent calls in
> > combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
> > to imply "shared"). The downside is that mapping may fail if the
> > coherent mask is too narrow.
>
> We have two big problems here that kinda interact:
>
> 1) DMA_ATTR_SKIP_CPU_SYNC is just a horrible API. It exposes an
> implementation detail instead of dealing with use cases.
> The original one IIRC was to deal with networking receive
> buffers that are often only partially filled and the networking
> folks wanted to avoid the overhead for doing the cache operations
> for the rest. It kinda works for that but already gets iffy
> when swiotlb is involved. The other abuses of the flag just
> went downhill from there.
>
> 2) the model of dma mapping a single chunk of memory to multiple
> devices is not really well accounted for in the DMA API.
>
> So for two we need a memory allocator that can take the constraints
> of multiple devices into account, and probably a way to fail a
> dma-buf attach when the importer can't address the memory.
> We also then need to come up with a memory ownership / cache
> maintenance protocol that works for this use case.
Being able to fail the attach without necessarily performing any
mapping yet would be an improvement. However, I think the original idea
was for dma-buf exporters to perform the constraint solving (if
possible) as attachments get added, and then finally allocate however
is best when the buffer is first mapped. But as far as I know, no
exporters currently do this. Instead, I think the problem is being
avoided by using custom exporters for particular sets of use cases
that are known to work on a given system. This swiotlb + uncached
example is one reason we'd want to fail the constraint solving. The
DMA API knows about the swiotlb part but not really about the
uncached part.
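
To make the "solve constraints at attach, allocate at first map" idea
concrete, here is a rough userspace model in C. It is only a sketch
under stated assumptions: struct sketch_buf and the sketch_*()
functions are hypothetical names and not an existing kernel or dma-buf
interface, each importer's constraint is reduced to a single DMA
address mask, and the allocator is assumed to place the buffer at the
lowest address it can hand out. Any attachment that could only be
satisfied by bouncing is rejected, which is the policy one would want
for the uncached case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical exporter state; none of these names exist in the kernel. */
struct sketch_buf {
	uint64_t size;       /* buffer size in bytes */
	uint64_t mem_base;   /* lowest physical address the allocator can hand out */
	uint64_t addr_limit; /* highest address every accepted importer can reach */
	uint64_t phys;       /* placement, valid only once allocated is true */
	bool allocated;
};

static void sketch_buf_init(struct sketch_buf *buf, uint64_t size,
			    uint64_t mem_base)
{
	assert(buf && size > 0);
	buf->size = size;
	buf->mem_base = mem_base;
	buf->addr_limit = UINT64_MAX; /* no importers yet, so no constraint */
	buf->phys = 0;
	buf->allocated = false;
}

/* True if [start, start + size) lies entirely at or below limit. */
static bool sketch_fits(uint64_t start, uint64_t size, uint64_t limit)
{
	return start <= limit && size - 1 <= limit - start;
}

/*
 * Attach an importer described only by its DMA mask. The attach fails,
 * without mapping anything, if the importer could not address the buffer
 * and would therefore need a bounce buffer.
 */
static int sketch_attach(struct sketch_buf *buf, uint64_t dma_mask)
{
	uint64_t limit, start;

	assert(buf);
	limit = dma_mask < buf->addr_limit ? dma_mask : buf->addr_limit;
	/* Before allocation we are free to pick the lowest possible placement. */
	start = buf->allocated ? buf->phys : buf->mem_base;

	if (!sketch_fits(start, buf->size, limit))
		return -EINVAL;

	buf->addr_limit = limit;
	return 0;
}

/* First map: commit to a placement that satisfies all accepted importers. */
static int sketch_map(struct sketch_buf *buf)
{
	assert(buf);
	if (!buf->allocated) {
		buf->phys = buf->mem_base;
		buf->allocated = true;
	}
	return 0;
}
```

In this model an importer with a 24-bit mask is turned away at attach
time if the allocator can only provide memory at 2 GiB and above, and
once the buffer has been placed, a late importer that cannot reach that
placement is rejected as well. The uncached attribute itself is not
modelled; it is just the reason bouncing is treated as a hard failure
rather than a fallback.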