Message-ID: <3636418.1680778595@warthog.procyon.org.uk>
Date: Thu, 06 Apr 2023 11:56:35 +0100
From: David Howells <dhowells@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: dhowells@...hat.com, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Matthew Wilcox <willy@...radead.org>,
Al Viro <viro@...iv.linux.org.uk>,
Christoph Hellwig <hch@...radead.org>,
Jens Axboe <axboe@...nel.dk>, Jeff Layton <jlayton@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Chuck Lever III <chuck.lever@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH net-next v5 00/19] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1

Eric Dumazet <edumazet@...gle.com> wrote:

> > Here's the first tranche of patches towards providing a MSG_SPLICE_PAGES
> > internal sendmsg flag that is intended to replace the ->sendpage() op with
> > calls to sendmsg(). MSG_SPLICE_PAGES is a hint that tells the protocol that it
> > should splice the pages supplied if it can and copy them if not.
> >
>
> I find this patch series quite big/risky for 6.4

If you want me to hold this till after the merge window, that's fine.

> Can you spell out why we need "unspliceable pages support" ?
> This seems to add quite a lot of code in fast paths.

The patches to copy unspliceable pages (patches 6, 14 and 19) only really add
to the MSG_SPLICE_PAGES path - I don't know whether you count this as a fast
path or not. (Or are you objecting to MSG_SPLICE_PAGES and getting rid of
sendpage in general?)

What I'm trying to do with this aspect is twofold:

Firstly, I'm trying to make it such that the layer above can send each
message in a single sendmsg() if possible. This is possible with sunrpc and
siw, for example, but currently they make a whole bunch of separate calls into
the transport layer - typically at least three for header, body, trailer.
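
As a rough userspace analogue of the "one sendmsg() per message" idea (not the
kernel-internal MSG_SPLICE_PAGES path, which userspace cannot request), the
header, body and trailer can be gathered into a single call with an iovec
array. The names send_framed() and demo_roundtrip() are hypothetical, and an
AF_UNIX datagram socketpair is assumed here only so that the sketch runs
without a network:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: hand header, body and trailer to the transport in
 * one sendmsg() call instead of three separate sends. */
ssize_t send_framed(int fd,
                    const void *hdr, size_t hdr_len,
                    const void *body, size_t body_len,
                    const void *trl, size_t trl_len)
{
    struct iovec iov[3];
    struct msghdr msg;

    assert(hdr && body && trl);

    /* One iovec entry per piece; sendmsg() only reads from these buffers. */
    iov[0].iov_base = (void *)hdr;
    iov[0].iov_len = hdr_len;
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = body_len;
    iov[2].iov_base = (void *)trl;
    iov[2].iov_len = trl_len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 3;

    return sendmsg(fd, &msg, 0);
}

/* Hypothetical demo: send a framed message over a socketpair and check
 * that the three pieces arrive as one contiguous datagram.
 * Returns 0 on success, -1 on any failure. */
int demo_roundtrip(void)
{
    static const char hdr[] = "HDR:";
    static const char body[] = "payload";
    static const char trl[] = ":END";
    const char *expect = "HDR:payload:END";
    char buf[64];
    int sv[2];
    ssize_t sent, got = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    sent = send_framed(sv[0], hdr, strlen(hdr), body, strlen(body),
                       trl, strlen(trl));
    if (sent == (ssize_t)strlen(expect))
        got = recv(sv[1], buf, sizeof(buf) - 1, 0);

    close(sv[0]);
    close(sv[1]);

    if (got != (ssize_t)strlen(expect))
        return -1;
    buf[got] = '\0';
    return strcmp(buf, expect) == 0 ? 0 : -1;
}
```

This only illustrates the call pattern; whether the pieces are spliced or
copied underneath is the protocol's decision and is not visible here.
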
Secondly, I'm trying to avoid a double copy. The layer above TCP/UDP/etc
(sunrpc[*], siw, etc.) needs to glue protocol bits on either end of the
message body and it may have this data in the slab or on the stack - which it
would then need to copy into a page fragment so that it can be zero-copied.
However, if the device can't handle this or we don't have sufficient frags, the
network layer may decide to copy it anyway - I'm not sure how the higher layer
can determine this.

It just seems there are fewer places this is required if it can be done in the
network protocol. Note that userspace cannot make use of this since it's
not allowed to set MSG_SPLICE_PAGES.

However, I have kept these bits separate and can discard them if it's considered a
bad idea and that MSG_SPLICE_PAGES should, say, give an error in such a case.

David

[*] sunrpc, at least, seems to store the header and trailer in zerocopyable
pages, but has an additional bit on the front that's not.