Message-ID: <2573495.1754550778@warthog.procyon.org.uk>
Date: Thu, 07 Aug 2025 08:12:58 +0100
From: David Howells <dhowells@...hat.com>
To: Stefan Metzmacher <metze@...ba.org>
Cc: dhowells@...hat.com, Steve French <sfrench@...ba.org>,
Paulo Alcantara <pc@...guebit.org>,
Shyam Prasad N <sprasad@...rosoft.com>, Tom Talpey <tom@...pey.com>,
Wang Zhaolong <wangzhaolong@...weicloud.com>,
Mina Almasry <almasrymina@...gle.com>, linux-cifs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 00/31] netfs: [WIP] Allow the use of MSG_SPLICE_PAGES and use netmem allocator
Stefan Metzmacher <metze@...ba.org> wrote:
> >> So the current situation is that we memcpy (at least) in sendmsg()
> >> and with your patches we do a memcpy higher in the stack, but then
> >> use MSG_SPLICE_PAGES in order to do it twice. Is that correct?
> > Not twice, no. MSG_SPLICE_PAGES allows sendmsg() to splice the supplied
> > pages into the sk_buffs directly, thereby avoiding a copy in the TCP layer
> > and cutting out the feeder loop in cifs.
>
> Yes, and we must be careful to not touch the pages after
> calling sendmsg(MSG_SPLICE_PAGES).
Until we get a response from the server, yes, but for the protocol info that
shouldn't be an issue. And if we're going to encrypt, we'll have to do a copy
anyway for something like Write, but we can get the encryption algo to do that
for us by giving it a separate destination buffer.
> And unlike MSG_ZEROCOPY, tcp_sendmsg_locked() has
> no struct ubuf_info *uarg when MSG_SPLICE_PAGES is used
> and there's no way to know when the pages are no longer
> used by the tcp stack.
Correct (and this is something we'll need to address), but for the moment we
can rely on page refcounts. MSG_SPLICE_PAGES takes a ref on each page, which
is why you can't use it with slab memory. However, netmem-allocated memory is
refcounted, so passing that in should work.
> Can you explain how/where we allocate the memory and where
> we unreference it in the caller of sendmsg(MSG_SPLICE_PAGES)?
Currently, we allocate the buffer in fs/netfs/buffer.c in
netfs_alloc_bvecq_buffer(). That just bulk allocates a bunch of pages and
adds them into a bvecq. As they're untyped pages, we can use the refcount. I
want to allocate netmem instead, but I haven't done that yet.
We then call sendmsg(MSG_SPLICE_PAGES) and then drop our ref on the pages.
TCP will have taken its own ref which it will drop in due course when the
skbuffs are cleaned up.
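
As a rough illustration of that lifetime, here is a userspace toy model of the
refcounting. It is not kernel code: struct toy_page and every toy_* function
are made-up names for this sketch, and the "freed" flag merely stands in for
the page going back to the allocator. It only shows the ordering: the
allocator holds one ref, the splice takes a second, the caller drops its own,
and the page is released when the stack drops the last one.

```c
#include <assert.h>

/* Toy stand-in for a refcounted page; NOT the kernel's struct page. */
struct toy_page {
	int refcount;	/* number of holders */
	int freed;	/* set once the last ref is dropped */
};

/* Models allocation: the allocating caller holds the initial ref. */
static inline void toy_page_alloc(struct toy_page *p)
{
	p->refcount = 1;
	p->freed = 0;
}

static inline void toy_page_get(struct toy_page *p)
{
	assert(!p->freed && p->refcount > 0);
	p->refcount++;
}

static inline void toy_page_put(struct toy_page *p)
{
	assert(!p->freed && p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = 1;	/* real code would return the page here */
}

/* Models sendmsg(MSG_SPLICE_PAGES): the stack takes its own ref per page. */
static inline void toy_splice_send(struct toy_page *pages, int n)
{
	for (int i = 0; i < n; i++)
		toy_page_get(&pages[i]);
}

/* Models the caller dropping its refs right after sendmsg() returns. */
static inline void toy_caller_release(struct toy_page *pages, int n)
{
	for (int i = 0; i < n; i++)
		toy_page_put(&pages[i]);
}

/* Models skb cleanup: the stack drops its refs in due course. */
static inline void toy_skb_cleanup(struct toy_page *pages, int n)
{
	for (int i = 0; i < n; i++)
		toy_page_put(&pages[i]);
}
```

Calling toy_page_alloc(), toy_splice_send(), toy_caller_release() and then
toy_skb_cleanup() in that order leaves each page alive until the final step,
which is the property being relied on above. The model says nothing about
knowing *when* that final step happens, which is the open issue noted earlier.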
David