Date:   Wed, 29 Mar 2023 16:32:48 +0100
From:   David Howells <dhowells@...hat.com>
To:     Bernard Metzler <BMT@...ich.ibm.com>
Cc:     dhowells@...hat.com, Matthew Wilcox <willy@...radead.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Christoph Hellwig <hch@...radead.org>,
        Jens Axboe <axboe@...nel.dk>, Jeff Layton <jlayton@...nel.org>,
        Christian Brauner <brauner@...nel.org>,
        Chuck Lever III <chuck.lever@...cle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Tom Talpey <tom@...pey.com>,
        "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
Subject: Re: [RFC PATCH v2 30/48] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit

Bernard Metzler <BMT@...ich.ibm.com> wrote:

> > When transmitting data, call down into TCP using a single sendmsg with
> > MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> > performing several sendmsg and sendpage calls to transmit header, data
> > pages and trailer.
> > 
> > To make this work, the data is assembled in a bio_vec array and attached to
> > a BVEC-type iterator.  The header and trailer (if present) are copied into
> > page fragments that can be freed with put_page().
> 
> I like it a lot if it still keeps zero copy sendpage() semantics for
> the cases the driver can make use of data transfers w/o copy. 
> Is 'msg.msg_flags |= MSG_SPLICE_PAGES' doing that magic?

Yes.  MSG_SPLICE_PAGES indicates that you want the socket to retain your
buffer and pass it directly to the device.  Note that it's only a hint,
however: pages that are unspliceable (e.g. because they belong to the slab)
will get copied into a page fragment instead.  Further, if the device cannot
support a vector, then the hint can be ignored and all the data copied as
normal.

> 'splicing' suggest just merging pages to me.

'splicing' as in what the splice system call does.

Unfortunately, MSG_ZEROCOPY is already a (different) thing.
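
For readers who have not used it, the user-space splice(2) call referred to
above can be sketched as follows.  This is only an illustration of the system
call, not of MSG_SPLICE_PAGES or of any siw code; the helper name
splice_through_pipes() and the pipe-to-pipe setup are assumptions made for the
example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg into one pipe, splice() it into a second
 * pipe, then read it back into out.  Returns the number of bytes read, or
 * -1 on error.  len is assumed to fit in the pipe buffer.
 */
ssize_t splice_through_pipes(const char *msg, size_t len,
			     char *out, size_t outlen)
{
	int src[2], dst[2];
	ssize_t n = -1;

	assert(msg && out);

	if (pipe(src) < 0)
		return -1;
	if (pipe(dst) < 0) {
		close(src[0]);
		close(src[1]);
		return -1;
	}

	/* Fill the source pipe from user space. */
	if (write(src[1], msg, len) != (ssize_t)len)
		goto out;

	/*
	 * Ask the kernel to move the data from one pipe to the other.  The
	 * data does not pass through a user-space buffer on this step.
	 */
	n = splice(src[0], NULL, dst[1], NULL, len, 0);
	if (n < 0)
		goto out;

	/* Read the spliced data back so the caller can inspect it. */
	n = read(dst[0], out, outlen);
out:
	close(src[0]);
	close(src[1]);
	close(dst[0]);
	close(dst[1]);
	return n;
}
```

Calling splice_through_pipes("hdr+data", 8, buf, sizeof(buf)) should leave the
same eight bytes in buf; at least one end of a splice() call has to be a pipe,
which is why pipes are used on both sides here.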

> It would simplify the transmit code path substantially, also getting
> rid of kmap_local_page()/kunmap_local() sequences for multi-fragment
> sendmsg()'s.

If the ITER_ITERLIST iterator is accepted, then siw would be able to mix
KVEC and BVEC iterators, as I did for sunrpc here:

	https://lore.kernel.org/linux-fsdevel/20230329141354.516864-42-dhowells@redhat.com/T/#u

This means that in siw_tx_hdt(), where I made it copy data into page fragments
using page_frag_memdup() and attach that to a bvec:

	hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
	h = page_frag_memdup(NULL, hdr, hdr_len, GFP_NOFS, ULONG_MAX);
	if (!h)
		goto done;
	bvec_set_virt(&bvec[0], h, hdr_len);
	seg = 1;

it can just set up a kvec instead.
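
As a rough user-space analogy only, the idea of handing header, data and
trailer to the transport in one sendmsg() call can be sketched with a struct
iovec array, which plays a role loosely similar to a kvec array in the kernel.
This is not the siw code and does not use MSG_SPLICE_PAGES; the helper name
send_hdr_data_trailer() and the AF_UNIX socket pair are assumptions chosen to
keep the example self-contained.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: send three separate buffers with a single sendmsg()
 * over a local socket pair and read the result back into out.
 * Returns the number of bytes received, or -1 on error.
 */
ssize_t send_hdr_data_trailer(const char *hdr, const char *data,
			      const char *trl, char *out, size_t outlen)
{
	struct iovec iov[3];
	struct msghdr msg;
	ssize_t n = -1;
	int sv[2];

	assert(hdr && data && trl && out);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* One vector entry per segment; nothing is merged beforehand. */
	iov[0].iov_base = (void *)hdr;
	iov[0].iov_len = strlen(hdr);
	iov[1].iov_base = (void *)data;
	iov[1].iov_len = strlen(data);
	iov[2].iov_base = (void *)trl;
	iov[2].iov_len = strlen(trl);

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = iov;
	msg.msg_iovlen = 3;

	/* A single call transmits header, data and trailer in order. */
	if (sendmsg(sv[0], &msg, 0) < 0)
		goto out;

	n = recv(sv[1], out, outlen, 0);
out:
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

With the inputs "HDR", "payload" and "CRC", the receiver should see the 13
bytes "HDRpayloadCRC" as one contiguous stream, even though the sender never
copied the three pieces into a single buffer itself.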

Unfortunately, it's not so easy to get rid of all of the kmap'ing as we need
to do some of it to do the hashing.

David
