Message-ID: <7lvduvov3rvfsgixbkyyinnzz3plpp3szxam46ccgjmh6v5d7q@zoz4k723vs3d>
Date: Tue, 22 Jul 2025 10:49:10 +1000
From: Alistair Popple <apopple@...dia.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Jason Gunthorpe <jgg@...pe.ca>, Yonatan Maman <ymaman@...dia.com>,
Jérôme Glisse <jglisse@...hat.com>, Andrew Morton <akpm@...ux-foundation.org>,
Leon Romanovsky <leon@...nel.org>, Lyude Paul <lyude@...hat.com>,
Danilo Krummrich <dakr@...nel.org>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Ben Skeggs <bskeggs@...dia.com>,
Michael Guralnik <michaelgur@...dia.com>, Or Har-Toov <ohartoov@...dia.com>,
Daisuke Matsuda <dskmtsd@...il.com>, Shay Drory <shayd@...dia.com>, linux-mm@...ck.org,
linux-rdma@...r.kernel.org, dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, Gal Shalom <GalShalom@...dia.com>
Subject: Re: [PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device
private pages
On Mon, Jul 21, 2025 at 02:23:13PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 18, 2025 at 11:44:42AM -0300, Jason Gunthorpe wrote:
> > On Fri, Jul 18, 2025 at 03:17:00PM +0100, Matthew Wilcox wrote:
> > > On Fri, Jul 18, 2025 at 02:51:08PM +0300, Yonatan Maman wrote:
> > > > +++ b/include/linux/memremap.h
> > > > @@ -89,6 +89,14 @@ struct dev_pagemap_ops {
> > > > */
> > > > vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
> > > >
> > > > + /*
> > > > + * Used for private (un-addressable) device memory only. Return a
> > > > + * corresponding PFN for a page that can be mapped to device
> > > > + * (e.g using dma_map_page)
> > > > + */
> > > > + int (*get_dma_pfn_for_device)(struct page *private_page,
> > > > + unsigned long *dma_pfn);
> > >
> > > This makes no sense. If a page is addressable then it has a PFN.
> > > If a page is not addressable then it doesn't have a PFN.
> >
> > The DEVICE_PRIVATE pages have a PFN, but it is not usable for
> > anything.
>
> OK, then I don't understand what DEVICE PRIVATE means.
>
> I thought it was for memory on a PCIe device that isn't even visible
> through a BAR and so the CPU has no way of addressing it directly.
Correct.
> But now you say that it has a PFN, which means it has a physical
> address, which means it's accessible to the CPU.
Having a PFN doesn't mean it's actually accessible to the CPU. It is a real
physical address in the CPU address space, but it is a completely bogus/invalid
address - if the CPU actually tries to access it, it will cause a machine check
or whatever other exception gets generated when accessing an invalid physical
address.
Obviously we're careful to avoid that. The PFN is used solely to get to/from a
struct page (via pfn_to_page() or page_to_pfn()).
> So what is it?
IMHO a hack, because obviously we shouldn't require real physical addresses for
something the CPU can't actually address anyway and this causes real problems
(e.g. it doesn't actually work on anything other than x86_64). There's no reason
the "PFN" we store in device-private entries couldn't instead just be an index
into some data structure holding pointers to the struct pages. So instead of
using pfn_to_page()/page_to_pfn() we would use device_private_index_to_page()
and page_to_device_private_index().
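
To make the idea concrete, here is a rough userspace sketch of what such an index-based lookup could look like. Everything in it is an assumption for illustration only: struct page is a stand-in rather than the kernel structure, the fixed-size array stands in for whatever data structure would really be used, and device_private_register_page() and the private_index field are hypothetical names that don't exist anywhere today.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct page; only what this sketch needs. */
struct page {
	unsigned long private_index;	/* hypothetical back-reference into the table */
};

#define DEVICE_PRIVATE_MAX_PAGES	1024UL
#define DEVICE_PRIVATE_NO_INDEX		(~0UL)

/*
 * Hypothetical table mapping an index to a struct page pointer. A flat
 * array keeps the sketch short; it is not a proposal for the real layout.
 */
static struct page *device_private_table[DEVICE_PRIVATE_MAX_PAGES];
static unsigned long device_private_next;

/* Hand out the next free index for a device-private page. */
unsigned long device_private_register_page(struct page *page)
{
	assert(page != NULL);
	if (device_private_next >= DEVICE_PRIVATE_MAX_PAGES)
		return DEVICE_PRIVATE_NO_INDEX;
	page->private_index = device_private_next;
	device_private_table[device_private_next] = page;
	return device_private_next++;
}

/* Would take the place of pfn_to_page() for device-private entries. */
struct page *device_private_index_to_page(unsigned long index)
{
	if (index >= device_private_next)
		return NULL;
	return device_private_table[index];
}

/* Would take the place of page_to_pfn() for device-private pages. */
unsigned long page_to_device_private_index(const struct page *page)
{
	return page->private_index;
}
```

The point of the sketch is only that the value stored in a device-private entry never has to be a physical address: it just needs to round-trip to and from the struct page.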
We discussed this briefly at LSFMM; I think your suggestion for a data structure
was to use a maple tree. I have yet to look at this more deeply, but I'd like to
figure out where memdescs fit in this picture too.
- Alistair
> > This is effectively converting from a DEVICE_PRIVATE page to an actual
> > DMA'able address of some kind. The DEVICE_PRIVATE is just a non-usable
> > proxy, like a swap entry, for where the real data is sitting.
> >
> > Jason
> >