Message-ID: <5a858e00-6fea-4a7a-93be-f23b66e00835@amd.com>
Date: Wed, 8 Jan 2025 16:25:54 +0100
From: Christian König <christian.koenig@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Christoph Hellwig <hch@....de>, Leon Romanovsky <leonro@...dia.com>,
Xu Yilun <yilun.xu@...ux.intel.com>, kvm@...r.kernel.org,
dri-devel@...ts.freedesktop.org, linux-media@...r.kernel.org,
linaro-mm-sig@...ts.linaro.org, sumit.semwal@...aro.org,
pbonzini@...hat.com, seanjc@...gle.com, alex.williamson@...hat.com,
vivek.kasireddy@...el.com, dan.j.williams@...el.com, aik@....com,
yilun.xu@...el.com, linux-coco@...ts.linux.dev,
linux-kernel@...r.kernel.org, lukas@...ner.de, yan.y.zhao@...el.com,
daniel.vetter@...ll.ch, leon@...nel.org, baolu.lu@...ux.intel.com,
zhenzhong.duan@...el.com, tao1.su@...el.com
Subject: Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI
On 08.01.25 at 15:58, Jason Gunthorpe wrote:
> On Wed, Jan 08, 2025 at 02:44:26PM +0100, Christian König wrote:
>
>>> Having the importer do the mapping is the correct way to operate the
>>> DMA API and the new API that Leon has built to fix the scatterlist
>>> abuse in dmabuf relies on importer mapping as part of its
>>> construction.
>> That is exactly what I strongly disagree with.
>>
>> DMA-buf works by providing DMA addresses the importer can work with and
>> *NOT* the underlying location of the buffer.
> The expectation is that the DMA API will be used to DMA map (most)
> things, and the DMA API always works with a physaddr_t/pfn
> argument. Basically, everything that is not a private address space
> should be supported by improving the DMA API. We are on course for
> finally getting all the common cases like P2P and MMIO solved
> here. That alone will take care of a lot.
Well, from experience the DMA API has failed more often than it has
actually worked in the way drivers require.
In particular, trying to hide architectural complexity in there instead
of properly exposing limitations to drivers is not something I consider
a good design approach.
So I am extremely critical of putting even more into it.
> For P2P cases we are going toward (PFN + P2P source information) as
> input to the DMA API. The additional "P2P source information" provides
> a good way for co-operating drivers to represent private address
> spaces as well. Both importer and exporter can have full understanding
> what is being mapped and do the correct things, safely.
I can say from experience that this is clearly not going to work for all
use cases.
It would mean that we have to pull a massive amount of driver-specific
functionality into the DMA API.
Things like programming access windows for PCI BARs are completely
driver-specific and, as far as I can see, can't be part of the DMA API
without things like callbacks.
With that in mind the DMA API would become a mid layer between different
drivers, and that is really not something you are suggesting, is it?
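
To illustrate the point about access windows, here is a purely
hypothetical userspace sketch in C (not kernel code). The names
toy_bar_window, toy_program_window_fn and toy_map_through_window are
invented for this example and are not part of any real API. It assumes
a device that exposes only a fixed-size slice of its memory through a
BAR, so reaching an arbitrary offset means reprogramming the window
first. A generic layer could only do that through a driver-supplied
callback:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a BAR that exposes only part of device memory at a time. */
struct toy_bar_window {
    uint64_t bar_bus_addr;  /* bus address at which the BAR is visible */
    uint64_t window_size;   /* bytes of device memory visible at once */
    uint64_t window_base;   /* device-memory offset currently exposed */
};

/* Hypothetical driver-specific hook that a generic layer would need. */
typedef void (*toy_program_window_fn)(struct toy_bar_window *w, uint64_t base);

/* Toy "driver" implementation; real hardware would write a register here. */
static void toy_program_window(struct toy_bar_window *w, uint64_t base)
{
    w->window_base = base;
}

/*
 * Return a bus address for dev_offset, moving the window when the offset
 * is not covered by the slice that is currently exposed.
 */
static uint64_t toy_map_through_window(struct toy_bar_window *w,
                                       toy_program_window_fn program,
                                       uint64_t dev_offset)
{
    assert(w->window_size != 0);

    /* Start of the window-sized slice that contains dev_offset. */
    uint64_t aligned = dev_offset - (dev_offset % w->window_size);

    if (aligned != w->window_base)
        program(w, aligned);    /* driver-specific step */

    return w->bar_bus_addr + (dev_offset - w->window_base);
}
```

In this toy model the only way for generic code to obtain a usable
address is to call back into the driver, which is the mid-layer
structure described above.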
> So, no, we don't lose private address space support when moving to
> importer mapping, in fact it works better because the importer gets
> more information about what is going on.
Well, sounds like I wasn't able to voice my concern. Let me try again:
We should not give importers information they don't need. Especially not
information about the backing store of buffers.
So importers getting more information about what's going on is a bad thing.
> I have imagined a staged approach where DMABUF gets a new API that
> works with the new DMA API to do importer mapping with "P2P source
> information" and a gradual conversion.
To make it clear: as maintainer of that subsystem I would reject such a
step with all I have.
We have already gone down that road; it didn't work at all, and it was
a really big pain to pull people back from it.
> Exporter mapping falls down in too many cases already:
>
> 1) Private address spaces don't work fully well because many devices
> need some indication what address space is being used and scatter list
> can't really properly convey that. If the DMABUF has a mixture of CPU
> and private it becomes a PITA
Correct, yes. That's why I said that scatterlist was a bad choice for
the interface.
But exposing the backing store to importers and then letting them do
whatever they want with it sounds like an even worse idea.
> 2) Multi-path PCI can require the importer to make mapping decisions
> unique to the device and program device specific information for the
> multi-path. We are doing this in mlx5 today and have hacks because
> DMABUF is destroying the information the importer needs to choose the
> correct PCI path.
That's why the exporter gets the struct device of the importer so that
it can plan how those accesses are made. Where exactly is the problem
with that?
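
As a purely illustrative sketch of that division of labour, here is a
userspace toy model in C. It is not the DMA-buf API: toy_device,
toy_exporter and toy_exporter_map are hypothetical names, and the
"domain" and offset fields are simplifying assumptions standing in for
real topology. The exporter receives a description of the importing
device, picks the access path itself, and hands back only an address
the importer can use:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the importer's struct device. */
struct toy_device {
    int pci_domain;         /* toy notion of where the device sits */
};

/* Hypothetical exporter; the backing store stays private to it. */
struct toy_exporter {
    uint64_t backing_base;  /* private location, not shown to importers */
    int pci_domain;         /* domain in which direct P2P is possible */
    uint64_t p2p_offset;    /* toy address offset for same-domain peers */
    uint64_t sys_offset;    /* toy offset when routed via the host */
};

/*
 * Exporter-side mapping: inspects the importing device, chooses the
 * path, and returns only the resulting address for the given offset.
 */
static uint64_t toy_exporter_map(const struct toy_exporter *exp,
                                 const struct toy_device *importer,
                                 size_t offset)
{
    assert(exp != NULL && importer != NULL);

    if (importer->pci_domain == exp->pci_domain)
        return exp->backing_base + exp->p2p_offset + offset;

    return exp->backing_base + exp->sys_offset + offset;
}
```

The importer in this model never learns backing_base or which path was
chosen; it only sees the returned address.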
When you have a use case which is not covered by the existing DMA-buf
interfaces then please voice that to me and other maintainers instead of
implementing some hack.
> 3) Importing devices need to know if they are working with PCI P2P
> addresses during mapping because they need to do things like turn on
> ATS on their DMA. As for multi-path we have the same hacks inside mlx5
> today that assume DMABUFs are always P2P because we cannot determine
> if things are P2P or not after being DMA mapped.
Why would you need ATS on PCI P2P and not for system memory accesses?
> 4) TPH bits need to be programmed into the importer device but are
> derived based on the NUMA topology of the DMA target. The importer has
> no idea what the DMA target actually was because the exporter mapping
> destroyed that information.
Yeah, but again that is completely intentional.
I assume you mean TLP processing hints when you say TPH and those should
be part of the DMA addresses provided by the exporter.
An importer trying to look behind the curtain and determine the NUMA
placement and topology itself is clearly a no-go from a design
perspective.
> 5) iommufd and kvm are both using CPU addresses without DMA. No
> exporter mapping is possible
We have customers using both KVM and XEN with DMA-buf, so I can clearly
confirm that this isn't true.
Regards,
Christian.
>
> Jason