Message-ID: <20250129134757.GA2120662@ziepe.ca>
Date: Wed, 29 Jan 2025 09:47:57 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
Yonatan Maman <ymaman@...dia.com>, kherbst@...hat.com,
lyude@...hat.com, dakr@...hat.com, airlied@...il.com,
simona@...ll.ch, leon@...nel.org, jglisse@...hat.com,
akpm@...ux-foundation.org, GalShalom@...dia.com,
dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-mm@...ck.org, linux-tegra@...r.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private
pages
On Wed, Jan 29, 2025 at 02:38:58PM +0100, Simona Vetter wrote:
> > The pgmap->owner doesn't *have* to be fixed, certainly during early boot before
> > you hand out any page references it can be changed. I wouldn't be
> > surprised if this is useful to some requirements to build up the
> > private interconnect topology?
>
> The trouble I'm seeing is device probe and the fundamental issue that you
> never know when you're done. And so if we entirely rely on pgmap->owner to
> figure out the driver private interconnect topology, that's going to be
> messy. That's why I'm also leaning towards both comparing owners and
> having an additional check for whether the interconnect is actually there
> yet or not.
Honestly, I'd rather invest more effort into being able to update
owner for those special corner cases than to slow down the fast path
in hmm_range_fault..
The notion is that owner should represent a contiguous region of
connectivity. IMHO you can always create this, but I suppose there
could be corner cases where you need to split/merge owners.
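
To make that concrete, a rough sketch of what I mean (not from this
series; my_interconnect_owner, devmem_pagemap_ops and devmem_register
are made-up names): every driver whose memory sits on the same private
interconnect publishes the same owner token in its pgmap, and that
token is what a later hmm_range_fault() caller matches against.

  #include <linux/device.h>
  #include <linux/err.h>
  #include <linux/memremap.h>

  /*
   * Shared token; only its address matters. Everything reachable over
   * the private interconnect registers its device memory with it.
   */
  static int my_interconnect_owner;

  /* Assumed to provide at least .migrate_to_ram and .page_free */
  static const struct dev_pagemap_ops devmem_pagemap_ops;

  static int devmem_register(struct device *dev, struct resource *res)
  {
  	struct dev_pagemap *pgmap;
  	void *addr;

  	pgmap = devm_kzalloc(dev, sizeof(*pgmap), GFP_KERNEL);
  	if (!pgmap)
  		return -ENOMEM;

  	pgmap->type = MEMORY_DEVICE_PRIVATE;
  	pgmap->range.start = res->start;
  	pgmap->range.end = res->end;
  	pgmap->nr_range = 1;
  	pgmap->ops = &devmem_pagemap_ops;
  	pgmap->owner = &my_interconnect_owner;	/* the connectivity group */

  	addr = devm_memremap_pages(dev, pgmap);
  	return PTR_ERR_OR_ZERO(addr);
  }

Which is also why changing the owner before any page references have
been handed out is the easy case.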
But again, this isn't set in stone; if someone has a better way to
match the private interconnects without going to driver callbacks, then
try that too.
I think driver callbacks inside hmm_range_fault should be the last resort..
> You can fake that by doing these checks after hmm_range_fault returned,
> and if you get a bunch of unsuitable pages, toss it back to
> hmm_range_fault asking for an unconditional migration to system memory for
> those. But that's kinda not great and I think goes at least against the
> spirit of how you want to handle pci p2p in step 2 below?
Right, hmm_range_fault should return pages that can be used and you
should not call it twice.
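
Roughly, the single-call pattern looks like this (illustrative names
again, and the retry loop around the notifier sequence is elided): the
dev_private_owner match decides, inside that one call, whether a
device-private page comes back usable or is faulted to system memory
first.

  #include <linux/hmm.h>
  #include <linux/mm.h>
  #include <linux/mmu_notifier.h>

  static int my_interconnect_owner;	/* same token the pgmap publishes */

  static int fault_user_range(struct mmu_interval_notifier *notifier,
  			      unsigned long start, unsigned long end,
  			      unsigned long *pfns)
  {
  	struct hmm_range range = {
  		.notifier = notifier,
  		.start = start,
  		.end = end,
  		.hmm_pfns = pfns,
  		.default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
  		.dev_private_owner = &my_interconnect_owner,
  	};
  	int ret;

  	range.notifier_seq = mmu_interval_read_begin(notifier);
  	mmap_read_lock(notifier->mm);
  	/*
  	 * Device-private pages owned by my_interconnect_owner come back
  	 * directly; any other device-private page is faulted, which
  	 * migrates it to system memory within this same call.
  	 */
  	ret = hmm_range_fault(&range);
  	mmap_read_unlock(notifier->mm);

  	return ret;	/* -EBUSY means the notifier raced; caller retries */
  }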
Jason