lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wji_2UwFMkUYkygsYRek05NwaQkH-vA=yKQtQS9Js+urQ@mail.gmail.com>
Date:   Thu, 1 Jun 2023 11:12:17 -0400
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     David Howells <dhowells@...hat.com>
Cc:     Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Paolo Abeni <pabeni@...hat.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        David Ahern <dsahern@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Jens Axboe <axboe@...nel.dk>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Chuck Lever <chuck.lever@...cle.com>,
        Boris Pismenny <borisp@...dia.com>,
        John Fastabend <john.fastabend@...il.com>,
        Christoph Hellwig <hch@...radead.org>
Subject: Re: Bug in short splice to socket?

On Thu, Jun 1, 2023 at 10:34 AM David Howells <dhowells@...hat.com> wrote:
>
> At the moment, it transcribes 16 pages at a time.  I could make it set
> MSG_MORE only if (a) SPLICE_F_MORE was passed into the splice() syscall or (b)
> there's yet more data in the buffer.

That would at least be a good first step.

> However, this might well cause a malfunction in UDP, for example.  MSG_MORE
> corks the current packet, so if I ask sendfile() say shove 32K into a packet,
> if, say, 16K is read from the source and entirely transcribed into the packet,

If you use splice() for UDP, I don't think you would normally expect
to get all that well-defined packet boundaries.

That said, I think *this* part of splice_direct_to_actor() is correct:

                if (read_len < len)
                        sd->flags |= SPLICE_F_MORE;    <- WRONG
                else if (!more)
                        sd->flags &= ~SPLICE_F_MORE; <- CORRECT

ie if we've used up all of the 'len' argument, *and* 'more' wasn't set
in the incoming flags, then at that point we should clear
SPLICE_F_MORE.

So that means that UDP packets boundaries will be honored at the
'splice()' system call 'len' argument.

Obviously packet boundaries might happen before that - ie depending on
what the packet size limits are.

But the "set SPLICE_F_MORE" bit is just wrong. The generic code simply
does not know enough to make that determination.

> if I understand what you're proposing, MSG_MORE wouldn't get set and the
> packet would be transmitted early.

No, I'm saying that MSG_MORE should be set depending on what the
splice *input* says.

If the splice input knows that it has more to give but stopped early
for whatever reason (typically that the splice pipe buffers filled up,
but that's not necessarily the *only* reason), then it should set
SPLICE_F_MORE.

But this is literally only something that the input side *can* know.

And as you mention, some input sides cannot know even that. Regular
files typically know if there is more data. Other dynamic sources may
simply not know. And if they know, they just shouldn't set
SPLICE_F_MORE.

Of course, SPLICE_F_MORE may then be set because the *user* passed in
that flag, but that's a completely separate issue. The user may pass
in that flag because the user wants maximally sized packets, and knows
that other things will be fed into the destination (not even
necessarily through splice) after the splice.

So you really have multiple different reasons why SPLICE_F_MORE might
get set, but that

                if (read_len < len)

is *not* a valid reason. And no, extending that logic with more random
logic is still not a valid reason.

            Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ