Message-ID: <60b7e29853cb33d45d10101e494c7ddbe6a5abd6.camel@linux.intel.com>
Date: Tue, 04 Feb 2025 23:01:25 +0100
From: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Yonatan Maman <ymaman@...dia.com>, kherbst@...hat.com, lyude@...hat.com,
dakr@...hat.com, airlied@...il.com, simona@...ll.ch, leon@...nel.org,
jglisse@...hat.com, akpm@...ux-foundation.org, GalShalom@...dia.com,
dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-mm@...ck.org, linux-tegra@...r.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private
pages
On Tue, 2025-02-04 at 15:16 -0400, Jason Gunthorpe wrote:
> On Tue, Feb 04, 2025 at 03:29:48PM +0100, Thomas Hellström wrote:
> > On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote:
> > > On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:
> > > >
> > >
> > > > 1) Existing users would never use the callback. They can still
> > > > rely on the owner check, only if that fails we check for
> > > > callback existence.
> > > > 2) By simply caching the result from the last checked
> > > > dev_pagemap, most callback calls could typically be eliminated.
> > >
> > > But then you are not in the locked region so your cache is racy
> > > and invalid.
> >
> > I'm not sure I follow? If a device private pfn handed back to the
> > caller is dependent on dev_pagemap A having a fast interconnect to
> > the client, then subsequent pfns in the same hmm_range_fault() call
> > must be able to make the same assumption (pagemap A having a fast
> > interconnect), else the whole result is invalid?
>
> But what is the receiver going to do with this device private page?
> Relock it again and check again if it is actually OK? Yuk.
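To be concrete about what I meant with 1) and 2) above, something
along these lines (a rough sketch only; p2p_allowed,
last_checked_pgmap and last_checked_result are made-up names for
illustration, they are not from the RFC or from the current hmm
interfaces):

#include <linux/hmm.h>
#include <linux/memremap.h>

/* Sketch: hypothetical helper called from the hmm_range_fault() walk. */
static bool hmm_devmem_allowed(struct hmm_range *range, struct page *page)
{
	struct dev_pagemap *pgmap = page->pgmap;

	/* 1) Existing users only ever hit the plain owner check. */
	if (pgmap->owner == range->dev_private_owner)
		return true;

	if (!pgmap->ops || !pgmap->ops->p2p_allowed)
		return false;

	/* 2) Ask at most once per dev_pagemap within the same walk. */
	if (pgmap != range->last_checked_pgmap) {
		range->last_checked_pgmap = pgmap;
		range->last_checked_result =
			pgmap->ops->p2p_allowed(pgmap,
						range->dev_private_owner);
	}

	return range->last_checked_result;
}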
I'm still lost as to what the possible race condition would be that
can't be handled in the usual way using mmu invalidations + a notifier
seqno bump. Is it the fast interconnect being taken down?
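By "the usual way" I mean the retry pattern documented in
Documentation/mm/hmm.rst, roughly as below (take_lock()/release_lock()
stand in for the driver's page-table lock, as in that document):

again:
	range.notifier_seq = mmu_interval_read_begin(&interval_sub);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	take_lock(driver->update);
	if (mmu_interval_read_retry(&interval_sub, range.notifier_seq)) {
		release_lock(driver->update);
		goto again;
	}
	/*
	 * Commit range.hmm_pfns to the device page tables here. If the
	 * interconnect went away in the meantime, whatever tore it down
	 * would have invalidated the range, bumping the seqno, and the
	 * check above makes us retry instead of committing stale pfns.
	 */
	release_lock(driver->update);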
/Thomas
>
> > > > 3) As mentioned before, a callback call would typically always
> > > > be followed by either migration to ram or a page-table update.
> > > > Compared to these, the callback overhead would IMO be
> > > > unnoticeable.
> > >
> > > Why? Surely the normal case should be a callback saying the
> > > memory can be accessed?
> >
> > Sure, but at least on the xe driver, that means page-table
> > repopulation, since the hmm_range_fault() typically originated
> > from a page-fault.
>
> Yes, I expect all hmm_range_fault()'s to be on page fault paths, and
> we'd like it to be as fast as we can in the CPU present case..
>
> Jason