Message-ID: <ZKNA9Pkg2vMJjHds@ziepe.ca>
Date: Mon, 3 Jul 2023 18:43:16 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Mina Almasry <almasrymina@...gle.com>
Cc: David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
	Jesper Dangaard Brouer <jbrouer@...hat.com>, brouer@...hat.com,
	Alexander Duyck <alexander.duyck@...il.com>,
	Yunsheng Lin <linyunsheng@...wei.com>, davem@...emloft.net,
	pabeni@...hat.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, Lorenzo Bianconi <lorenzo@...nel.org>,
	Yisen Zhuang <yisen.zhuang@...wei.com>,
	Salil Mehta <salil.mehta@...wei.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Sunil Goutham <sgoutham@...vell.com>,
	Geetha sowjanya <gakula@...vell.com>,
	Subbaraya Sundeep <sbhatta@...vell.com>,
	hariprasad <hkelam@...vell.com>, Saeed Mahameed <saeedm@...dia.com>,
	Leon Romanovsky <leon@...nel.org>, Felix Fietkau <nbd@....name>,
	Ryder Lee <ryder.lee@...iatek.com>,
	Shayne Chen <shayne.chen@...iatek.com>,
	Sean Wang <sean.wang@...iatek.com>, Kalle Valo <kvalo@...nel.org>,
	Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Ilias Apalodimas <ilias.apalodimas@...aro.org>,
	linux-rdma@...r.kernel.org, linux-wireless@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org,
	linux-mediatek@...ts.infradead.org,
	Jonathan Lemon <jonathan.lemon@...il.com>
Subject: Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5]
 page_pool: remove PP_FLAG_PAGE_FRAG flag)

On Sun, Jul 02, 2023 at 11:22:33PM -0700, Mina Almasry wrote:
> On Sun, Jul 2, 2023 at 9:20 PM David Ahern <dsahern@...nel.org> wrote:
> >
> > On 6/29/23 8:27 PM, Mina Almasry wrote:
> > >
> > > Hello Jakub, I'm looking into device memory (peer-to-peer) networking
> > > actually, and I plan to pursue using the page pool as a front end.
> > >
> > > Quick description of what I have so far:
> > > The current implementation uses device memory with struct pages; I
> > > am putting all those pages in a gen_pool, and we have written an
> > > allocator that hands out pages from the gen_pool. In the driver, we
> > > use this allocator instead of alloc_page() (the driver in question
> > > is gve, which currently doesn't use the page pool). When the driver
> > > is done with the p2p page, it simply decrements the refcount on it
> > > and the page is freed back to the gen_pool.
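A rough sketch of what a gen_pool-backed page allocator like the one
described above could look like (all names, and the choice to key the
pool by physical address, are illustrative guesses, not the actual
gve/devmem code):

#include <linux/genalloc.h>
#include <linux/mm.h>
#include <linux/pfn.h>

static struct gen_pool *devmem_pool;    /* backed by device memory */

static int devmem_pool_init(phys_addr_t base, size_t size, int nid)
{
        devmem_pool = gen_pool_create(PAGE_SHIFT, nid);
        if (!devmem_pool)
                return -ENOMEM;

        /* Hand the whole device-memory region to the pool. */
        return gen_pool_add(devmem_pool, (unsigned long)base, size, nid);
}

/* Used by the driver in place of alloc_page(). */
static struct page *devmem_alloc_page(void)
{
        unsigned long phys = gen_pool_alloc(devmem_pool, PAGE_SIZE);

        if (!phys)
                return NULL;    /* pool exhausted */

        return pfn_to_page(PHYS_PFN(phys));
}

/* Called when the refcount drops; the page goes straight back to the pool. */
static void devmem_free_page(struct page *page)
{
        gen_pool_free(devmem_pool, PFN_PHYS(page_to_pfn(page)), PAGE_SIZE);
}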
> 
> Quick update here: I was able to get my implementation working with
> the page pool as a front end, using the memory provider API Jakub
> wrote here:
> https://github.com/kuba-moo/linux/tree/pp-providers
> 
> The main complication was indeed the fact that my device memory pages
> are ZONE_DEVICE pages, which are incompatible with the page_pool due
> to the union in struct page. I thought of a couple of approaches to
> resolve that.
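For reference, the conflict is that the page_pool fields and the
ZONE_DEVICE fields occupy the same union in struct page, so a page can
only wear one of the two hats at a time (simplified excerpt, roughly as
mm_types.h looked at the time):

struct page {
        unsigned long flags;
        union {
                struct {        /* page_pool used by netstack */
                        unsigned long pp_magic;
                        struct page_pool *pp;
                        unsigned long _pp_mapping_pad;
                        unsigned long dma_addr;
                        union {
                                unsigned long dma_addr_upper;
                                atomic_long_t pp_frag_count;
                        };
                };
                struct {        /* ZONE_DEVICE pages */
                        struct dev_pagemap *pgmap;
                        void *zone_device_data;
                };
                /* ... other overlapping users elided ... */
        };
        /* ... */
};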
> 
> 1. Make my device memory pages non-ZONE_DEVICE pages. 

Hard no on this from an mm perspective. We need P2P memory to be
properly tagged and to have the expected struct pages so it is DMA
mappable; otherwise, you totally break everything if you try to do this.

> 2. Convert the pages from ZONE_DEVICE pages to page_pool pages and
> vice versa as they're being inserted and removed from the page pool.

This is kind of scary; it is very, very fragile to rework the pages
like this. E.g. what happens when the owning device unplugs and needs
to revoke these pages? I think it would likely crash.

I think it also technically breaks the DMA API as we may need to look
into the pgmap to do cache ops on some architectures.

I suggest you try to work with 8k folios; then the tail page's
struct page is empty enough to store the information you need.
Or allocate per-page memory and do a memdesc-like thing.
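One way to picture the "per-page memory / memdesc-like" option: keep the
page_pool bookkeeping for device pages in a side table keyed by pfn, so
the ZONE_DEVICE words in struct page are never overwritten. Everything
below is illustrative; none of these names exist in the kernel:

#include <linux/xarray.h>
#include <linux/slab.h>
#include <linux/mm.h>

struct devmem_pp_meta {
        struct page_pool *pp;
        unsigned long dma_addr;
        atomic_long_t pp_frag_count;
};

static DEFINE_XARRAY(devmem_pp_meta_xa);        /* indexed by pfn */

static int devmem_pp_meta_attach(struct page *page, struct page_pool *pp,
                                 unsigned long dma_addr)
{
        struct devmem_pp_meta *meta;

        meta = kzalloc(sizeof(*meta), GFP_KERNEL);
        if (!meta)
                return -ENOMEM;

        meta->pp = pp;
        meta->dma_addr = dma_addr;
        atomic_long_set(&meta->pp_frag_count, 1);

        /* xa_store() returns the old entry or an xa_err()-encoded error. */
        return xa_err(xa_store(&devmem_pp_meta_xa, page_to_pfn(page), meta,
                               GFP_KERNEL));
}

static struct devmem_pp_meta *devmem_pp_meta_lookup(struct page *page)
{
        return xa_load(&devmem_pp_meta_xa, page_to_pfn(page));
}

static void devmem_pp_meta_detach(struct page *page)
{
        kfree(xa_erase(&devmem_pp_meta_xa, page_to_pfn(page)));
}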

Though overall, you won't find devices creating struct pages for their
P2P memory today, so I'm not sure what the purpose is. Jonathan
already got highly slammed for proposing code to the kernel that was
unusable. Please don't repeat that. Other than a special NVMe use case,
the interface for P2P is DMABUF right now, and it is not struct page
backed.

Even if we did get to struct pages for device memory, it is highly
likely that the cases you are interested in will be using folios larger
than 4k, so the page pool would need to cope with this nicely as well.

Jason
