Message-ID: <20250204132615.GI2296753@ziepe.ca>
Date: Tue, 4 Feb 2025 09:26:15 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Thomas Hellström <thomas.hellstrom@...ux.intel.com>
Cc: Yonatan Maman <ymaman@...dia.com>, kherbst@...hat.com, lyude@...hat.com,
	dakr@...hat.com, airlied@...il.com, simona@...ll.ch,
	leon@...nel.org, jglisse@...hat.com, akpm@...ux-foundation.org,
	GalShalom@...dia.com, dri-devel@...ts.freedesktop.org,
	nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
	linux-rdma@...r.kernel.org, linux-mm@...ck.org,
	linux-tegra@...r.kernel.org
Subject: Re: [RFC 1/5] mm/hmm: HMM API to enable P2P DMA for device private
 pages

On Tue, Feb 04, 2025 at 10:32:32AM +0100, Thomas Hellström wrote:
> > I would not be happy to see this. Please improve pagemap directly if
> > you think you need more things.
> 
> These are mainly helpers to migrate and populate a range of CPU address
> space (struct mm_struct) with GPU device_private memory, migrate back to
> system memory on GPU memory shortage, and implement the migrate_to_vram
> pagemap op tied to GPU device memory allocations, so I don't think there
> is anything we should be exposing at the dev_pagemap level at this point?

Maybe that belongs in mm/hmm then?
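
For context, the one hook struct dev_pagemap_ops has in this area today is
migrate_to_ram; below is a rough, purely illustrative sketch of the
device-private side it pairs with. The gpu_* naming is invented, and the
migrate_to_vram op referred to above would be a proposed driver-side
counterpart, not something that exists in dev_pagemap_ops:

#include <linux/memremap.h>
#include <linux/mm.h>

static vm_fault_t gpu_pagemap_migrate_to_ram(struct vm_fault *vmf)
{
	/* Hypothetical: walk the faulting VMA range and migrate the
	 * device-private pages back to system memory, e.g. via the
	 * existing migrate_vma_setup()/migrate_vma_pages()/
	 * migrate_vma_finalize() path.
	 */
	return 0;	/* VM_FAULT_* code from the real migration path */
}

static const struct dev_pagemap_ops gpu_pagemap_ops = {
	.migrate_to_ram = gpu_pagemap_migrate_to_ram,
};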

> > Neither really match the expected design here. The owner should be
> > entirely based on reachability. Devices that cannot reach each other
> > directly should have different owners.
> 
> Actually what I'm putting together is a small helper to allocate and
> assign an "owner" based on devices previously registered in a
> "registry". The caller has to indicate, using a callback function, for
> each struct device pair whether a fast interconnect is available,
> and this is expected to be done at pagemap creation time, so I think
> this aligns with the above. Initially a "registry" (which is a list of
> device-owner pairs) will be driver-local, but could easily have a wider
> scope.

Yeah, that seems like a workable idea
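
For illustration only, a rough sketch of what such a registry could look
like; every identifier here is a made-up placeholder, not anything from the
series, and locking is omitted for brevity:

#include <linux/device.h>
#include <linux/list.h>
#include <linux/slab.h>

struct pagemap_owner_entry {
	struct list_head link;
	struct device *dev;
	void *owner;			/* opaque dev_pagemap owner token */
};

struct pagemap_owner_registry {
	struct list_head entries;
	/* driver-supplied: true if @a and @b share a fast interconnect */
	bool (*peers)(struct device *a, struct device *b);
};

/*
 * Reuse an existing owner if some already-registered device is a peer of
 * @dev, otherwise this device starts a fresh owner. Meant to run at pagemap
 * creation time.
 */
static void *pagemap_owner_assign(struct pagemap_owner_registry *reg,
				  struct device *dev)
{
	struct pagemap_owner_entry *e, *ent;

	ent = kzalloc(sizeof(*ent), GFP_KERNEL);
	if (!ent)
		return NULL;

	ent->dev = dev;
	ent->owner = ent;		/* default: reachable by nobody else */
	list_for_each_entry(e, &reg->entries, link) {
		if (reg->peers(dev, e->dev)) {
			ent->owner = e->owner;
			break;
		}
	}
	list_add_tail(&ent->link, &reg->entries);
	return ent->owner;
}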

> This means we handle access control, unplug checks and similar at
> migration time, typically before hmm_range_fault(), and the role of
> hmm_range_fault() will be to return pfns whose backing memory is
> directly accessible to the device, else migrate to system.

Yes, that sounds right
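
For reference, this is the dev_private_owner behaviour hmm_range_fault()
already exposes: device-private pages whose pagemap owner matches the token
are handed back as-is, anything else is faulted into (or left in) system
memory. The wrapper name below is made up and the mmap lock / retry
boilerplate is elided:

#include <linux/hmm.h>
#include <linux/mmu_notifier.h>

static int fault_range_for_device(struct mmu_interval_notifier *notifier,
				  unsigned long start, unsigned long end,
				  unsigned long *pfns, void *owner)
{
	struct hmm_range range = {
		.notifier          = notifier,
		.notifier_seq      = mmu_interval_read_begin(notifier),
		.start             = start,
		.end               = end,
		.hmm_pfns          = pfns,
		.default_flags     = HMM_PFN_REQ_FAULT,
		.dev_private_owner = owner,	/* shared "reachable" token */
	};

	/* Caller holds mmap_read_lock(); -EBUSY means redo the sequence. */
	return hmm_range_fault(&range);
}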

> 1) Existing users would never use the callback. They can still rely on
> the owner check; only if that fails do we check for callback existence.
> 2) By simply caching the result from the last checked dev_pagemap, most
> callback calls could typically be eliminated.

But then you are not in the locked region so your cache is racy and
invalid.
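
The documented HMM collision-retry pattern shows where any such check has to
live: only the window after a successful mmu_interval_read_retry(), under the
driver lock, is safe to act on. The device struct, lock and final helper
below are invented for the sketch:

#include <linux/hmm.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/mutex.h>

struct my_dev {				/* hypothetical */
	struct mutex pt_lock;
};

static int map_range_locked(struct my_dev *dev, struct mm_struct *mm,
			    struct hmm_range *range)
{
	int ret;

again:
	range->notifier_seq = mmu_interval_read_begin(range->notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(range);
	mmap_read_unlock(mm);
	if (ret == -EBUSY)
		goto again;
	if (ret)
		return ret;

	mutex_lock(&dev->pt_lock);
	if (mmu_interval_read_retry(range->notifier, range->notifier_seq)) {
		mutex_unlock(&dev->pt_lock);
		goto again;
	}
	/* Only here, with the sequence validated and the lock held, is it
	 * safe to decide what a pfn points at; anything cached from an
	 * earlier pass sits outside this window and may already be stale.
	 */
	ret = my_dev_program_ptes(dev, range);	/* hypothetical */
	mutex_unlock(&dev->pt_lock);
	return ret;
}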

> 3) As mentioned before, a callback call would typically always be
> followed by either migration to ram or a page-table update. Compared to
> these, the callback overhead would IMO be unnoticeable.

Why? Surely the normal case should be a callback saying the memory can
be accessed?

> 4) pcie_p2p is already planning a dev_pagemap callback?

Yes, but it is not a racy validation callback, and it is already
creating a complicated lifecycle problem inside the exporting
driver.

Jason
