[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <522642.1680103968@warthog.procyon.org.uk>
Date: Wed, 29 Mar 2023 16:32:48 +0100
From: David Howells <dhowells@...hat.com>
To: Bernard Metzler <BMT@...ich.ibm.com>
Cc: dhowells@...hat.com, Matthew Wilcox <willy@...radead.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Christoph Hellwig <hch@...radead.org>,
Jens Axboe <axboe@...nel.dk>, Jeff Layton <jlayton@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Chuck Lever III <chuck.lever@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Tom Talpey <tom@...pey.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
Subject: Re: [RFC PATCH v2 30/48] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
Bernard Metzler <BMT@...ich.ibm.com> wrote:
> > When transmitting data, call down into TCP using a single sendmsg with
> > MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> > performing several sendmsg and sendpage calls to transmit header, data
> > pages and trailer.
> >
> > To make this work, the data is assembled in a bio_vec array and attached to
> > a BVEC-type iterator. The header and trailer (if present) are copied into
> > page fragments that can be freed with put_page().
>
> I like it a lot if it still keeps zero copy sendpage() semantics for
> the cases the driver can make use of data transfers w/o copy.
> Is 'msg.msg_flags |= MSG_SPLICE_PAGES' doing that magic?
Yes. MSG_SPLICE_PAGES indicates that you want the socket to retain your
buffer and pass it directly to the device. Note that it's just a hint,
however, pages that are unspliceable (eg. they belong to the slab) will get
copied into a page fragment instead. Further, if the device cannot support a
vector, then the hint can be ignored and all the data can be copied as normal.
> 'splicing' suggest just merging pages to me.
'splicing' as in what the splice system call does.
Unfortunately, MSG_ZEROCOPY is already a (different) thing.
> It would simplify the transmit code path substantially, also getting
> rid of kmap_local_page()/kunmap_local() sequences for multi-fragment
> sendmsg()'s.
If the ITER_ITERLIST iterator is accepted, then siw would be able to do mix
KVEC and BVEC iterators, e.g. what I did for sunrpc here:
https://lore.kernel.org/linux-fsdevel/20230329141354.516864-42-dhowells@redhat.com/T/#u
This means that in siw_tx_hdt() where I made it copy data into page fragments
using page_frag_memdup() and attach that to a bvec:
hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
h = page_frag_memdup(NULL, hdr, hdr_len, GFP_NOFS, ULONG_MAX);
if (!h)
goto done;
bvec_set_virt(&bvec[0], h, hdr_len);
seg = 1;
it can just set up a kvec instead.
Unfortunately, it's not so easy to get rid of all of the kmap'ing as we need
to do some of it to do the hashing.
David
Powered by blists - more mailing lists