netdev - Re: Bug in short splice to socket?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wghhHghtvU_SzXxAzfaX35BkNs-x91-Vj6+6tnVEhPrZg@mail.gmail.com>
Date: Thu, 1 Jun 2023 09:19:18 -0400
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Howells <dhowells@...hat.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, Willem de Bruijn <willemdebruijn.kernel@...il.com>, 
	David Ahern <dsahern@...nel.org>, Matthew Wilcox <willy@...radead.org>, Jens Axboe <axboe@...nel.dk>, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	Chuck Lever <chuck.lever@...cle.com>, Boris Pismenny <borisp@...dia.com>, 
	John Fastabend <john.fastabend@...il.com>, Christoph Hellwig <hch@...radead.org>
Subject: Re: Bug in short splice to socket?

On Thu, Jun 1, 2023 at 9:09 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> The reason the old code is garbage is that it sets SPLICE_F_MORE
> entirely in the wrong place. It sets it *after* it has done the
> splice(). That's just crazy.

Clarification, because there are two splice's (from and to): by "after
the splice" I mean after the splice source, of course. It's still set
before the splice _to_ the network.

(But it is still true that I hope the networking code itself then sets
MSG_MORE even if SPLICE_F_MORE wasn't set, if it gets a buffer that is
bigger than what it can handle right now - so there are two
*different* reasons for "more data" - either the source knows it has
more to give, or the destination knows it didn't use everything it
got).

The point is that the splice *source* knows whether there is more data
to be had, and that's where the "there is more" should be set.

But the generic code does *not* know. You add a completely horrendous
hack to kind of approximate that knowledge, but it's *wrong*.

The old code was wrong too, of course. No question about that.

Basically my argument is that the whole "there is more data" should be
set by "->splice_read()" not by some hack in some generic
splice_direct_to_actor() function that is absolutely not supposed to
know about the internals of the source or the destination.

Do we have a good interface for that? No. I get the feeling that to
actually fix this, we'd have to pass in the 'struct splice_desc"
pointer to ->splice_read().

Then the reading side can say "yup, I have more data for you after
this", and it all makes sense at least conceptually.

So please, can we just fix the layering violation, rather than make it worse?

                Linus