Message-ID: <20250122143744.GF5556@nvidia.com>
Date: Wed, 22 Jan 2025 10:37:44 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Christian König <christian.koenig@....com>
Cc: Xu Yilun <yilun.xu@...ux.intel.com>, Christoph Hellwig <hch@....de>,
Leon Romanovsky <leonro@...dia.com>, kvm@...r.kernel.org,
dri-devel@...ts.freedesktop.org, linux-media@...r.kernel.org,
linaro-mm-sig@...ts.linaro.org, sumit.semwal@...aro.org,
pbonzini@...hat.com, seanjc@...gle.com, alex.williamson@...hat.com,
vivek.kasireddy@...el.com, dan.j.williams@...el.com, aik@....com,
yilun.xu@...el.com, linux-coco@...ts.linux.dev,
linux-kernel@...r.kernel.org, lukas@...ner.de, yan.y.zhao@...el.com,
leon@...nel.org, baolu.lu@...ux.intel.com, zhenzhong.duan@...el.com,
tao1.su@...el.com
Subject: Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI
On Wed, Jan 22, 2025 at 02:29:09PM +0100, Christian König wrote:
> I'm having all kinds of funny phenomena with AMD's mail servers since coming
> back from xmas vacation.
:(
A few years back our IT fully migrated our email into the Office 365
cloud and gave up all the crazy half on-prem stuff they were
doing. The mail started working perfectly after that, as long as
you use MS's servers directly :\
> But you don't want to handle mmap() on your own, you basically don't want to
> have a VMA for this stuff at all, correct?
Right, we have no interest in mmap, VMAs or struct page in
rdma/kvm/iommu.
> > > My main interest has been what data structure is produced in the
> > > attach APIs.
> > >
> > > Eg today we have a struct dma_buf_attachment that returns a sg_table.
> > >
> > > I'm expecting some kind of new data structure, let's call it "physical
> > > list" that is some efficient coding of meta/addr/len tuples that works
> > > well with the new DMA API. Matthew has been calling this thing phyr..
>
> I would not use a data structure at all. Instead we should have something
> like an iterator/cursor based approach similar to what the new DMA API is
> doing.
I'm certainly open to this idea. There may be some technical
challenges: it is a big change from scatterlist today, and a function
pointer call per page sounds like bad performance if there are a lot
of pages..
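Just to make sure we mean the same shape of thing, here is a rough
sketch of what I picture for a cursor - every name below is invented
for this mail, none of it is an existing API:

struct dma_buf_phys_cursor {
	struct dma_buf_attachment *attach;
	u64 offset;		/* current byte offset into the buffer */
};

/*
 * Advance to the next physically contiguous range and return its
 * address and length. Calling back into the exporter like this for
 * every small range is where the function-pointer-per-page overhead
 * worries me when buffers have a lot of pages.
 */
int dma_buf_phys_cursor_next(struct dma_buf_phys_cursor *cur,
			     phys_addr_t *phys, size_t *len);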
RDMA would probably have to stuff this immediately into something like
a phyr anyhow, because it needs the full extent of the thing being
mapped to figure out what the HW page size and geometry should be -
that would be trivial though, and an RDMA problem.
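To be concrete, the kind of thing RDMA would build internally is just
an addr/len/meta tuple list along these lines - illustrative only, not
any existing structure:

struct phyr {
	phys_addr_t addr;	/* start of a contiguous physical range */
	size_t len;		/* length of the range in bytes */
	unsigned long flags;	/* cacheable/encrypted/forbidden/etc */
};

struct phyr_list {
	unsigned int nr;	/* number of ranges */
	struct phyr entries[];	/* flexible array of ranges */
};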
> > > Now, if you are asking if the current dmabuf mmap callback can be
> > > improved with the above? Maybe? phyr should have the necessary
> > > information inside it to populate a VMA - eventually even fully
> > > correctly with all the right cacheable/encrypted/forbidden/etc flags.
>
> That won't work like this.
Note I said "populate a VMA", ie a helper to build the VMA PTEs only.
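Ie something of roughly this shape, reusing the illustrative phyr_list
from above. The helper itself is made up, and a real one would have to
take the exporter's cacheable/encrypted/etc flags into account when
choosing the pgprot:

#include <linux/mm.h>

/*
 * Made-up helper: install PTEs covering a phyr list into an existing
 * VMA. It only builds the PTEs; fault handling, cache syncing and
 * invalidation stay with the exporter exactly as they do today.
 */
static int phyr_populate_vma(struct vm_area_struct *vma,
			     const struct phyr_list *pl)
{
	unsigned long va = vma->vm_start;
	unsigned int i;
	int ret;

	for (i = 0; i < pl->nr; i++) {
		ret = remap_pfn_range(vma, va,
				      PHYS_PFN(pl->entries[i].addr),
				      pl->entries[i].len,
				      vma->vm_page_prot);
		if (ret)
			return ret;
		va += pl->entries[i].len;
	}
	return 0;
}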
> See the exporter needs to be informed about page faults on the VMA to
> eventually wait for operations to end and sync caches.
All of this would still have to be provided outside in the same way as
today.
> For example we have cases where multiple devices are in the same IOMMU domain
> and re-use their DMA address mappings.
IMHO this is just another flavour of "private" address flow between
two cooperating drivers.
It is not a "dma address" in the sense of a dma_addr_t that was output
from the DMA API. I think that subtle distinction is very
important. When I say pfn/dma address I'm really only talking about
standard DMA API flows, used by generic drivers.
IMHO, DMABUF needs a private address "escape hatch", and the address
passed through it is *fully private*, so the cooperating drivers can
do whatever they want with that flow. iommu_map in the exporter and
pass an IOVA? Fine! Pass a PFN and iommu_map in the importer? Also
fine! Private is private.
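For example the exporter side of the "exporter maps and hands back an
IOVA" flavour could be as simple as this - the function and the way the
IOVA gets handed over are invented here, the point is only that nothing
in it is a dma_addr_t produced by the DMA API:

#include <linux/iommu.h>

/*
 * Invented example of one private flavour: the exporter maps into the
 * IOMMU domain it shares with the importer and passes the IOVA through
 * the private attachment. No DMA API dma_addr_t anywhere.
 */
static int example_export_private_iova(struct iommu_domain *shared_domain,
				       unsigned long iova, phys_addr_t phys,
				       size_t size, unsigned long *out_iova)
{
	int ret;

	ret = iommu_map(shared_domain, iova, phys, size,
			IOMMU_READ | IOMMU_WRITE, GFP_KERNEL);
	if (ret)
		return ret;

	*out_iova = iova;	/* importer uses this directly */
	return 0;
}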
> > But in theory it should be possible to use phyr everywhere eventually, as
> > long as there's no obviously api-rules-breaking way to go from a phyr back to
> > a struct page even when that exists.
>
> I would rather say we should stick to DMA addresses as much as possible.
I remain skeptical of this, aside from all the technical reasons I
already outlined..
I think it is too much work to have the exporters conditionally build
all sorts of different representations of the same thing depending on
the importer. Having a lot of DRM drivers generate both a PFN list and
a DMA-mapped list in their export code doesn't sound very appealing to
me at all.
It makes sense that a driver would be able to conditionally generate
private and generic representations based on negotiation, but IMHO,
not more than one flavour of generic..
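Ie negotiation along these lines, with everything below invented just
to show what I mean:

enum dma_buf_addr_flavour {
	DMA_BUF_ADDR_GENERIC	= 1 << 0,	/* PFN list, importer does the DMA mapping */
	DMA_BUF_ADDR_PRIVATE	= 1 << 1,	/* fully private between the two drivers */
};

struct dma_buf_attach_caps {
	unsigned int importer_supported;	/* mask of DMA_BUF_ADDR_* the importer accepts */
	unsigned int exporter_chosen;		/* the single flavour the exporter picked */
};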
Jason