Message-ID: <aFlaxwpKChYXFf8A@infradead.org>
Date: Mon, 23 Jun 2025 06:46:47 -0700
From: Christoph Hellwig <hch@...radead.org>
To: David Howells <dhowells@...hat.com>
Cc: Christoph Hellwig <hch@...radead.org>, Andrew Lunn <andrew@...n.ch>,
Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
David Hildenbrand <david@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
Mina Almasry <almasrymina@...gle.com>, willy@...radead.org,
Christian Brauner <brauner@...nel.org>,
Al Viro <viro@...iv.linux.org.uk>, netdev@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, Leon Romanovsky <leon@...nel.org>,
Logan Gunthorpe <logang@...tatee.com>,
Jason Gunthorpe <jgg@...dia.com>
Subject: Re: How to handle P2P DMA with only {physaddr,len} in bio_vec?
Hi David,
On Mon, Jun 23, 2025 at 11:50:58AM +0100, David Howells wrote:
> What's the best way to manage this without having to go back to the page
> struct for every DMA mapping we want to make?
There isn't a very easy way, not least because if you actually need to
do peer-to-peer transfers, you currently need the page to find the
pgmap that holds the information on how to perform the peer-to-peer
transfer.
> Do we need to have
> iov_extract_user_pages() note this in the bio_vec?
>
> 	struct bio_vec {
> 		physaddr_t	bv_base_addr;	/* 64-bits */
> 		size_t		bv_len:56;	/* Maybe just u32 */
> 		bool		p2pdma:1;	/* Region is involved in P2P */
> 		unsigned int	spare:7;
> 	};
Having a flag in the bio_vec might be a way to shortcut the P2P or not
decision a bit. The downside is that without the flag, the bio_vec
in the brave new page-less world would actually just be:
	struct bio_vec {
		phys_addr_t	bv_phys;
		u32		bv_len;
	} __packed;
i.e. adding any more information would actually increase the size from
12 bytes to 16 bytes for the usual 64-bit phys_addr_t setups, and thus
undo all the memory savings that this move would provide.
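To make the size argument concrete, here is a small userspace sketch (not
kernel code). It assumes a 64-bit phys_addr_t and a GCC/Clang-style packed
attribute. The bio_vec_flagged layout and the array_bytes() helper are
hypothetical and only there for illustration; they are not proposals from
this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit physical addresses, as on the usual 64-bit setups. */
typedef uint64_t phys_addr_t;
typedef uint32_t u32;

/* The page-less layout from the mail: 8 + 4 bytes, no tail padding. */
struct bio_vec_packed {
	phys_addr_t	bv_phys;
	u32		bv_len;
} __attribute__((packed));

/*
 * Hypothetical variant carrying a P2P flag with natural alignment.
 * The flag lands in a new 32-bit unit, so the struct grows to 16 bytes.
 */
struct bio_vec_flagged {
	phys_addr_t	bv_phys;
	u32		bv_len;
	u32		p2pdma:1;
	u32		spare:31;
};

static_assert(sizeof(struct bio_vec_packed) == 12, "packed layout is 12 bytes");
static_assert(sizeof(struct bio_vec_flagged) == 16, "flagged layout is 16 bytes");

/* Bytes needed for an array of nr_vecs entries of the given size. */
static size_t array_bytes(size_t elem_size, size_t nr_vecs)
{
	return elem_size * nr_vecs;
}
```

For an array of 256 entries, picked here only as an example count, that
works out to 3072 bytes for the packed layout versus 4096 bytes for the
flagged one.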
Note that at least for the block layer, the DMA mapping changes I'm about
to send out again require each bio to be either non-P2P or P2P to a
specific device. It might be worth extending this higher-level
limitation to other users as well, if feasible.
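The "either non-P2P or P2P to one specific device" rule can be pictured
with the userspace sketch below. It is only a model of the constraint
under made-up names: struct p2p_provider, struct xfer and
xfer_add_segment() are hypothetical and are not the block layer's actual
data structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a P2P memory provider (one per device). */
struct p2p_provider {
	int id;
};

enum xfer_kind { XFER_EMPTY, XFER_HOST, XFER_P2P };

/* Hypothetical container for the segments of one I/O request. */
struct xfer {
	enum xfer_kind			kind;
	const struct p2p_provider	*provider;	/* NULL unless XFER_P2P */
};

/*
 * Try to add a segment to @x. A NULL @provider means ordinary host
 * memory. The first segment decides the kind of the whole transfer;
 * later segments are accepted only if they match it exactly.
 */
static bool xfer_add_segment(struct xfer *x, const struct p2p_provider *provider)
{
	enum xfer_kind kind = provider ? XFER_P2P : XFER_HOST;

	assert(x != NULL);

	if (x->kind == XFER_EMPTY) {
		x->kind = kind;
		x->provider = provider;
		return true;
	}

	/* Reject host/P2P mixes and P2P segments from another provider. */
	return x->kind == kind && x->provider == provider;
}
```

With a rule like this the P2P decision is made once per transfer rather
than once per segment, which is what would make a per-bio_vec flag
unnecessary.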
> I'm guessing that only folio-type pages can be involved in this:
>
> 	static inline struct dev_pagemap *page_pgmap(const struct page *page)
> 	{
> 		VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page);
> 		return page_folio(page)->pgmap;
> 	}
>
> as only struct folio has a pointer to dev_pagemap? And I assume this is going
> to get removed from struct page itself at some point soonish.
I guess so.