Message-ID: <20250128172123.GD1524382@ziepe.ca>
Date: Tue, 28 Jan 2025 13:21:23 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
Cc: Yonatan Maman <ymaman@...dia.com>, kherbst@...hat.com, lyude@...hat.com,
dakr@...hat.com, airlied@...il.com, simona@...ll.ch,
leon@...nel.org, jglisse@...hat.com, akpm@...ux-foundation.org,
GalShalom@...dia.com, dri-devel@...ts.freedesktop.org,
nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org, linux-mm@...ck.org,
linux-tegra@...r.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private
pages
On Tue, Jan 28, 2025 at 05:32:23PM +0100, Thomas Hellström wrote:
> > This series supports three cases:
> >
> > 1) pgmap->owner == range->dev_private_owner
> > This is "driver private fast interconnect"; in this case HMM should
> > immediately return the page. The calling driver understands the
> > private parts of the pgmap and computes the private interconnect
> > address.
> >
> > This requires organizing your driver so that all private
> > interconnect has the same pgmap->owner.
>
> Yes, although that makes this map static, since pgmap->owner has to be
> set at pgmap creation time, and during initial discussions we were
> looking at something dynamic here. However, I think we can probably do
> with a per-driver owner for now and get back to this if that's not
> sufficient.
The pgmap->owner doesn't *have* to be fixed; certainly during early
boot, before you hand out any page references, it can be changed. I
wouldn't be surprised if this is useful for some setups to build up
the private interconnect topology?
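
Concretely, all the driver side has to arrange is that every pgmap on
the same fast interconnect carries the same owner token before the
pages go live. A rough sketch of what I mean (struct my_gpu,
my_gpu_pagemap_ops and my_interconnect_get_owner() are made-up names,
not anything in the tree):

/*
 * Sketch only: my_gpu, my_gpu_pagemap_ops and
 * my_interconnect_get_owner() are hypothetical driver-side names.
 */
#include <linux/memremap.h>
#include <linux/err.h>

static int my_gpu_init_devmem(struct my_gpu *gpu)
{
	struct dev_pagemap *pgmap = &gpu->pgmap;
	void *addr;

	pgmap->type = MEMORY_DEVICE_PRIVATE;
	pgmap->range.start = gpu->devmem_res->start;
	pgmap->range.end = gpu->devmem_res->end;
	pgmap->nr_range = 1;
	pgmap->ops = &my_gpu_pagemap_ops;

	/*
	 * Every driver instance on the same private interconnect points
	 * at the same token; it can still be rewritten during early
	 * boot, before any page reference has been handed out.
	 */
	pgmap->owner = my_interconnect_get_owner(gpu);

	addr = devm_memremap_pages(gpu->dev, pgmap);
	return PTR_ERR_OR_ZERO(addr);
}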
> > 2) The page is DEVICE_PRIVATE and get_dma_pfn_for_device() exists.
> > The exporting driver has the option to return a P2P struct page
> > that can be used for PCI P2P without any migration. In a PCI GPU
> > context this means the GPU has mapped its local memory to a PCI
> > address. The assumption is that P2P always works and so this
> > address can be DMA'd from.
>
> So do I understand it correctly that the driver then needs to set up
> one device_private struct page and one pcie_p2p struct page for each
> page of device memory participating in this way?
Yes, for now. I hope to remove the p2p page eventually.
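
To sketch the exporting side of that, it is roughly something like the
below (the gpu_* helpers are made up, and the exact
get_dma_pfn_for_device() signature is whatever this series ends up
with):

/*
 * Sketch only: get_dma_pfn_for_device() is the op proposed by this
 * series, the exact signature may differ; the gpu_* helpers are
 * hypothetical. One pcie_p2p struct page is created per page of GPU
 * local memory when the BAR mapping is set up.
 */
static unsigned long gpu_get_dma_pfn_for_device(struct page *private_page)
{
	/* Hand back the PFN of the pre-allocated P2P twin page */
	struct page *p2p_page = gpu_mem_to_p2p_page(private_page);

	return page_to_pfn(p2p_page);
}

static const struct dev_pagemap_ops gpu_pagemap_ops = {
	.page_free		= gpu_page_free,
	.migrate_to_ram		= gpu_migrate_to_ram,
	.get_dma_pfn_for_device	= gpu_get_dma_pfn_for_device,
};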
> > If you are just talking about your private multi-path, then that is
> > already handled..
>
> No, the issue I'm having with this is really why would
> hmm_range_fault() need the new pfn when it could easily be obtained
> from the device-private pfn by the hmm_range_fault() caller?
That isn't the API of HMM; the caller uses hmm to get PFNs it can use.
Deliberately returning PFNs the caller cannot use is nonsensical to
its purpose :)
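
To put it another way, the caller side is roughly the below
(simplified: the notifier retry dance around hmm_range_fault() and the
locking needed before the pages are actually used are omitted, and
everything prefixed my_ is a made-up driver helper):

/*
 * Sketch only: my_owner_token, my_program_pte(), my_interconnect_addr()
 * and my_dma_map_page() are hypothetical; the -EBUSY/notifier retry
 * loop is omitted for brevity.
 */
#include <linux/hmm.h>
#include <linux/memremap.h>
#include <linux/mmu_notifier.h>
#include <linux/mm.h>
#include <linux/slab.h>

static int my_map_range(struct mmu_interval_notifier *notifier,
			unsigned long start, unsigned long npages)
{
	struct hmm_range range = {
		.notifier = notifier,
		.start = start,
		.end = start + npages * PAGE_SIZE,
		.default_flags = HMM_PFN_REQ_FAULT,
		/* Shared by every driver on the same private interconnect */
		.dev_private_owner = &my_owner_token,
	};
	unsigned long i;
	int ret;

	range.hmm_pfns = kvcalloc(npages, sizeof(*range.hmm_pfns), GFP_KERNEL);
	if (!range.hmm_pfns)
		return -ENOMEM;

	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(notifier->mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(notifier->mm);
	if (ret)
		goto out;

	for (i = 0; i < npages; i++) {
		struct page *page = hmm_pfn_to_page(range.hmm_pfns[i]);

		if (is_device_private_page(page)) {
			/* Case 1: owner matched, use the fast interconnect */
			my_program_pte(i, my_interconnect_addr(page));
		} else {
			/* System memory or a P2P page usable for DMA */
			my_program_pte(i, my_dma_map_page(page));
		}
	}
out:
	kvfree(range.hmm_pfns);
	return ret;
}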
> So anyway, what we'll do is try to use an interconnect-common owner
> for now and revisit the problem if that's not sufficient, so we can
> come up with an acceptable solution.
That is the intention for sure. The idea was that the drivers under
the private pages would somehow generate unique owners for shared
private interconnect segments.
I wouldn't say this is the end-all of the idea; if there are better
ways to handle accepting private pages they can certainly be
explored..
Jason