Message-ID: <60b7e29853cb33d45d10101e494c7ddbe6a5abd6.camel@linux.intel.com>
Date: Tue, 04 Feb 2025 23:01:25 +0100
From: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Yonatan Maman <ymaman@...dia.com>, kherbst@...hat.com, lyude@...hat.com,
dakr@...hat.com, airlied@...il.com, simona@...ll.ch, leon@...nel.org,
jglisse@...hat.com, akpm@...ux-foundation.org, GalShalom@...dia.com,
dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
linux-mm@...ck.org, linux-tegra@...r.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private
pages
On Tue, 2025-02-04 at 15:16 -0400, Jason Gunthorpe wrote:
> On Tue, Feb 04, 2025 at 03:29:48PM +0100, Thomas Hellström wrote:
> > On Tue, 2025-02-04 at 09:26 -0400, Jason Gunthorpe wrote:
> > > On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:
> > > >
> > >
> > > > 1) Existing users would never use the callback. They can still
> > > > rely on the owner check, only if that fails we check for
> > > > callback existence.
> > > > 2) By simply caching the result from the last checked
> > > > dev_pagemap, most callback calls could typically be eliminated.
> > >
> > > But then you are not in the locked region so your cache is racy
> > > and invalid.
> >
> > I'm not sure I follow? If a device private pfn handed back to the
> > caller is dependent on dev_pagemap A having a fast interconnect to
> > the client, then subsequent pfns in the same hmm_range_fault() call
> > must be able to make the same assumption (pagemap A having a fast
> > interconnect), else the whole result is invalid?
>
> But what is the receiver going to do with this device private page?
> Relock it again and check again if it is actually OK? Yuk.
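To be concrete about what I meant with 1) and 2) above, something
along these lines (a rough sketch only; p2p_allowed,
last_checked_pgmap and last_checked_result are made-up names for
illustration, they are not from the RFC or from the current hmm
interfaces):

#include <linux/hmm.h>
#include <linux/memremap.h>

/* Sketch: hypothetical helper called from the hmm_range_fault() walk. */
static bool hmm_devmem_allowed(struct hmm_range *range, struct page *page)
{
	struct dev_pagemap *pgmap = page->pgmap;

	/* 1) Existing users only ever hit the plain owner check. */
	if (pgmap->owner == range->dev_private_owner)
		return true;

	if (!pgmap->ops || !pgmap->ops->p2p_allowed)
		return false;

	/* 2) Ask at most once per dev_pagemap within the same walk. */
	if (pgmap != range->last_checked_pgmap) {
		range->last_checked_pgmap = pgmap;
		range->last_checked_result =
			pgmap->ops->p2p_allowed(pgmap,
						range->dev_private_owner);
	}

	return range->last_checked_result;
}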
I'm still lost as to what the possible race condition would be that
can't be handled in the usual way using mmu invalidations + a notifier
seqno bump. Is it the fast interconnect being taken down?
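By "the usual way" I mean the retry pattern documented in
Documentation/mm/hmm.rst, roughly as below (take_lock()/release_lock()
stand in for the driver's page-table lock, as in that document):

again:
	range.notifier_seq = mmu_interval_read_begin(&interval_sub);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	take_lock(driver->update);
	if (mmu_interval_read_retry(&interval_sub, range.notifier_seq)) {
		release_lock(driver->update);
		goto again;
	}
	/*
	 * Commit range.hmm_pfns to the device page tables here. If the
	 * interconnect went away in the meantime, whatever tore it down
	 * would have invalidated the range, bumping the seqno, and the
	 * check above makes us retry instead of committing stale pfns.
	 */
	release_lock(driver->update);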
/Thomas
>
> > > > 3) As mentioned before, a callback call would typically always
> > > > be followed by either migration to ram or a page-table update.
> > > > Compared to these, the callback overhead would IMO be
> > > > unnoticeable.
> > >
> > > Why? Surely the normal case should be a callback saying the
> > > memory can be accessed?
> >
> > Sure, but at least on the xe driver, that means page-table
> > repopulation, since the hmm_range_fault() typically originated
> > from a page-fault.
>
> Yes, I expect all hmm_range_fault()'s to be on page fault paths, and
> we'd like it to be as fast as we can in the CPU present case..
>
> Jason