lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZhQOgjkna94y9VBx@pop-os.localdomain>
Date: Mon, 8 Apr 2024 08:34:26 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: Christoph Hellwig <hch@...radead.org>,
	Marc Harvey <marcharvey@...gle.com>,
	"Cong Wang ." <cong.wang@...edance.com>, shakeel.butt@...ux.dev,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-alpha@...r.kernel.org,
	linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org,
	sparclinux@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
	linux-arch@...r.kernel.org, bpf@...r.kernel.org,
	linux-kselftest@...r.kernel.org, linux-media@...r.kernel.org,
	dri-devel@...ts.freedesktop.org,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Jonathan Corbet <corbet@....net>,
	Richard Henderson <richard.henderson@...aro.org>,
	Ivan Kokshaysky <ink@...assic.park.msu.ru>,
	Matt Turner <mattst88@...il.com>,
	Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
	Helge Deller <deller@....de>, Andreas Larsson <andreas@...sler.com>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Ilias Apalodimas <ilias.apalodimas@...aro.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Arnd Bergmann <arnd@...db.de>, Alexei Starovoitov <ast@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Andrii Nakryiko <andrii@...nel.org>,
	Martin KaFai Lau <martin.lau@...ux.dev>,
	Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
	Yonghong Song <yonghong.song@...ux.dev>,
	John Fastabend <john.fastabend@...il.com>,
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...gle.com>,
	Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
	David Ahern <dsahern@...nel.org>,
	Willem de Bruijn <willemdebruijn.kernel@...il.com>,
	Shuah Khan <shuah@...nel.org>,
	Sumit Semwal <sumit.semwal@...aro.org>,
	Christian König <christian.koenig@....com>,
	Pavel Begunkov <asml.silence@...il.com>, David Wei <dw@...idwei.uk>,
	Jason Gunthorpe <jgg@...pe.ca>,
	Yunsheng Lin <linyunsheng@...wei.com>,
	Shailend Chand <shailend@...gle.com>,
	Harshitha Ramamurthy <hramamurthy@...gle.com>,
	Jeroen de Borst <jeroendb@...gle.com>,
	Praveen Kaligineedi <pkaligineedi@...gle.com>
Subject: Re: [RFC PATCH net-next v6 02/15] net: page_pool: create hooks for
 custom page providers

On Mon, Apr 01, 2024 at 12:22:24PM -0700, Mina Almasry wrote:
> On Thu, Mar 28, 2024 at 12:31 AM Christoph Hellwig <hch@...radead.org> wrote:
> >
> > On Tue, Mar 26, 2024 at 01:19:20PM -0700, Mina Almasry wrote:
> > >
> > > Are you envisioning that dmabuf support would be added to the block
> > > layer
> >
> > Yes.
> >
> > > (which I understand is part of the VFS and not driver specific),
> >
> > The block layer isn't really the VFS, it's just another core stack
> > like the network stack.
> >
> > > or as part of the specific storage driver (like nvme for example)? If
> > > we can add dmabuf support to the block layer itself that sounds
> > > awesome. We may then be able to do devmem TCP on all/most storage
> > > devices without having to modify each individual driver.
> >
> > I suspect we'll still need to touch the drivers to understand it,
> > but hopefully all the main infrastructure can live in the block layer.
> >
> > > In your estimation, is adding dmabuf support to the block layer
> > > something technically feasible & acceptable upstream? I notice you
> > > suggested it so I'm guessing yes to both, but I thought I'd confirm.
> >
> > I think so, and I know there has been quite some interest to at least
> > pre-register userspace memory so that the iommu overhead can be
> > pre-loaded.  It also is a much better interface for Peer to Peer
> > transfers than what we currently have.
> >

Thanks for copying me on this. This sounds really great. 

Also P2PDMA requires PCI root complex to support this kind of direct transfer,
and IIUC dmabuf does not have such hardware dependency.

> 
> I think this is positively thrilling news for me. I was worried that
> adding devmemTCP support to storage devices would involve using a
> non-dmabuf standard of buffer sharing like pci_p2pdma_
> (drivers/pci/p2pdma.c) and that would require messy changes to
> pci_p2pdma_ that would get nacked. Also it would require adding
> pci_p2pdma_ support to devmem TCP, which is a can of worms. If adding
> dma-buf support to storage devices is feasible and desirable, that's a
> much better approach IMO. (a) it will maybe work with devmem TCP
> without any changes needed on the netdev side of things and (b)
> dma-buf support may be generically useful and a good contribution even
> outside of devmem TCP.

I think the major difference is its interface, which exposes an mmap memory
region instead of fd: https://lwn.net/Articles/906092/.

> 
> I don't have a concrete user for devmem TCP for storage devices but
> the use case is very similar to GPU and I imagine the benefits in perf
> can be significant in some setups.

We have storage use cases at ByteDance, we use NVME SSD to cache videos
transferred through network, so moving data directly from SSD to NIC
would help a lot.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ