Message-ID: <1143687.1750755725@warthog.procyon.org.uk>
Date: Tue, 24 Jun 2025 10:02:05 +0100
From: David Howells <dhowells@...hat.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: dhowells@...hat.com, Andrew Lunn <andrew@...n.ch>,
    Eric Dumazet <edumazet@...gle.com>,
    "David S. Miller" <davem@...emloft.net>,
    Jakub Kicinski <kuba@...nel.org>,
    David Hildenbrand <david@...hat.com>,
    John Hubbard <jhubbard@...dia.com>,
    Mina Almasry <almasrymina@...gle.com>, willy@...radead.org,
    Christian Brauner <brauner@...nel.org>,
    Al Viro <viro@...iv.linux.org.uk>, netdev@...r.kernel.org,
    linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
    linux-kernel@...r.kernel.org, Leon Romanovsky <leon@...nel.org>,
    Logan Gunthorpe <logang@...tatee.com>,
    Jason Gunthorpe <jgg@...dia.com>
Subject: Re: How to handle P2P DMA with only {physaddr,len} in bio_vec?

Christoph Hellwig <hch@...radead.org> wrote:

> On Mon, Jun 23, 2025 at 11:50:58AM +0100, David Howells wrote:
> > What's the best way to manage this without having to go back to the page
> > struct for every DMA mapping we want to make?
> 
> There isn't a very easy way.  Also because if you actually need to do
> peer to peer transfers, you right now absolutely need the page to find
> the pgmap that has the information on how to perform the peer to peer
> transfer.

Are you expecting P2P to become particularly common?  I ask because page
struct lookups will become more expensive: we'll have to do type checking, and
Willy may eventually move them from a fixed array into a maple tree.  If we
can record the P2P flag in the bio_vec, it would help speed up the "not P2P"
case.

> > Do we need to have
> > iov_extract_user_pages() note this in the bio_vec?
> > 
> > 	struct bio_vec {
> > 		physaddr_t	bv_base_addr;	/* 64-bits */
> > 		size_t		bv_len:56;	/* Maybe just u32 */
> > 		bool		p2pdma:1;	/* Region is involved in P2P */
> > 		unsigned int	spare:7;
> > 	};
> 
> Having a flag in the bio_vec might be a way to shortcut the P2P or not
> decision a bit.  The downside is that without the flag, the bio_vec
> in the brave new page-less world would actually just be:
> 
> 	struct bio_vec {
> 		phys_addr_t	bv_phys;
> 		u32		bv_len;
> 	} __packed;
> 
> i.e. adding any more information would actually increase the size from
> 12 bytes to 16 bytes for the usual 64-bit phys_addr_t setups, and thus
> undo all the memory savings that this move would provide.

Do we actually need 32 bits for bv_len, especially given that MAX_RW_COUNT is
capped at a bit less than 2GiB?  Could we, say, do:

	struct bio_vec {
		phys_addr_t	bv_phys;
		u32		bv_len:31;
		u32		bv_use_p2p:1;
	} __packed;
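
As a rough userspace sketch of that layout (not kernel code), the following
checks that the bitfield variant stays at 12 bytes and that a length of
0x7ffff000 still fits in 31 bits.  That value is what MAX_RW_COUNT
(INT_MAX & PAGE_MASK) works out to with 4KiB pages.  The phys_addr_t typedef,
BV_LEN_MAX and bvec_make() are made up for illustration, and the size check
assumes GCC or Clang packed-bitfield layout:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit physical addresses, standing in for the kernel type. */
typedef uint64_t phys_addr_t;

/* Hypothetical constant: largest length that fits in a 31-bit field. */
#define BV_LEN_MAX 0x7fffffffu

struct bio_vec {
	phys_addr_t	bv_phys;
	uint32_t	bv_len:31;	/* length, limited to 31 bits */
	uint32_t	bv_use_p2p:1;	/* region is involved in P2P */
} __attribute__((packed));

/* The flag shares the length word, so the struct should stay at 12 bytes. */
_Static_assert(sizeof(struct bio_vec) == 12, "bio_vec grew beyond 12 bytes");

/* Hypothetical helper: build a bio_vec, rejecting lengths that do not fit. */
static struct bio_vec bvec_make(phys_addr_t phys, uint32_t len, int p2p)
{
	struct bio_vec bv;

	assert(len <= BV_LEN_MAX);
	bv.bv_phys = phys;
	bv.bv_len = len;
	bv.bv_use_p2p = p2p ? 1 : 0;
	return bv;
}
```

The cost is a mask on every bv_len access rather than a plain 32-bit load.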

And rather than storing the how-to-do-P2P info in the page struct, does it
make sense to hold it separately, keyed on bv_phys?
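
To illustrate what "keyed on bv_phys" could look like, here is a minimal
userspace sketch that keeps the P2P provider info in a sorted table of
physical address ranges and finds the entry by binary search.  All names
(struct p2p_range, provider_id, p2p_lookup()) are hypothetical, and a real
kernel implementation would presumably use an existing structure such as an
xarray or maple tree instead.  The lookup would only be needed when the P2P
flag is set, so the "not P2P" case never pays for it:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit physical addresses, standing in for the kernel type. */
typedef uint64_t phys_addr_t;

/* Hypothetical record describing one P2P-capable physical range. */
struct p2p_range {
	phys_addr_t	start;		/* first address in the range */
	phys_addr_t	end;		/* one past the last address */
	int		provider_id;	/* stand-in for the how-to-do-P2P info */
};

/*
 * Find the range containing addr.  The table must be sorted by start
 * address and its ranges must not overlap.  Returns NULL if addr is not
 * in any registered range.
 */
static const struct p2p_range *p2p_lookup(const struct p2p_range *tbl,
					  size_t n, phys_addr_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < tbl[mid].start)
			hi = mid;
		else if (addr >= tbl[mid].end)
			lo = mid + 1;
		else
			return &tbl[mid];
	}
	return NULL;
}
```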

Also, is it possible for the networking stack, say, to trivially map the P2P
memory in order to checksum it?  I presume bv_phys in that case would point to
a mapping of device memory?

Thanks,
David

