[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wghhHghtvU_SzXxAzfaX35BkNs-x91-Vj6+6tnVEhPrZg@mail.gmail.com>
Date: Thu, 1 Jun 2023 09:19:18 -0400
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Howells <dhowells@...hat.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Willem de Bruijn <willemdebruijn.kernel@...il.com>,
David Ahern <dsahern@...nel.org>, Matthew Wilcox <willy@...radead.org>, Jens Axboe <axboe@...nel.dk>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Chuck Lever <chuck.lever@...cle.com>, Boris Pismenny <borisp@...dia.com>,
John Fastabend <john.fastabend@...il.com>, Christoph Hellwig <hch@...radead.org>
Subject: Re: Bug in short splice to socket?
On Thu, Jun 1, 2023 at 9:09 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> The reason the old code is garbage is that it sets SPLICE_F_MORE
> entirely in the wrong place. It sets it *after* it has done the
> splice(). That's just crazy.
Clarification, because there are two splice's (from and to): by "after
the splice" I mean after the splice source, of course. It's still set
before the splice _to_ the network.
(But it is still true that I hope the networking code itself then sets
MSG_MORE even if SPLICE_F_MORE wasn't set, if it gets a buffer that is
bigger than what it can handle right now - so there are two
*different* reasons for "more data" - either the source knows it has
more to give, or the destination knows it didn't use everything it
got).
The point is that the splice *source* knows whether there is more data
to be had, and that's where the "there is more" should be set.
But the generic code does *not* know. You add a completely horrendous
hack to kind of approximate that knowledge, but it's *wrong*.
The old code was wrong too, of course. No question about that.
Basically my argument is that the whole "there is more data" should be
set by "->splice_read()" not by some hack in some generic
splice_direct_to_actor() function that is absolutely not supposed to
know about the internals of the source or the destination.
Do we have a good interface for that? No. I get the feeling that to
actually fix this, we'd have to pass in the 'struct splice_desc"
pointer to ->splice_read().
Then the reading side can say "yup, I have more data for you after
this", and it all makes sense at least conceptually.
So please, can we just fix the layering violation, rather than make it worse?
Linus
Powered by blists - more mailing lists