netdev - Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALMXkpZ4isoXpFp_5=nVUcWrt5TofYVhpdAjv7LkCH7RFW1tYw@mail.gmail.com>
Date:   Mon, 17 Jun 2019 20:19:58 -0700
From:   Christoph Paasch <christoph.paasch@...il.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jonathan Looney <jtl@...flix.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Tyler Hicks <tyhicks@...onical.com>,
        Yuchung Cheng <ycheng@...gle.com>,
        Bruce Curtis <brucec@...flix.com>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Dustin Marquess <dmarquess@...le.com>
Subject: Re: [PATCH net 2/4] tcp: tcp_fragment() should apply sane memory limits

On Mon, Jun 17, 2019 at 7:28 PM Eric Dumazet <eric.dumazet@...il.com> wrote:
>
>
>
> On 6/17/19 5:18 PM, Christoph Paasch wrote:
> >
> > Hi Eric, I now have a packetdrill test that started failing (see
> > below). Admittedly, a bit weird test with the SO_SNDBUF forced so low.
> >
> > Nevertheless, previously this test would pass, now it stalls after the
> > write() because tcp_fragment() returns -ENOMEM. Your commit-message
> > mentions that this could trigger when one sets SO_SNDBUF low. But,
> > here we have a complete stall of the connection and we never recover.
> >
> > I don't know if we care about this, but there it is :-)
>
> I guess it is WAI :)
>
> Honestly I am not sure we want to add code just to allow these degenerated use cases.
>
> Upstream kernels could check if rtx queue is empty or not, but this check will be not trivial to backport
>
>
> Can you test :

Yes, this does the trick for my packetdrill-test.

I wonder, is there a way we could end up in a situation where we can't
retransmit anymore?
For example, sk_wmem_queued has grown so much that the new test fails.
Then, if we legitimately need to fragment in __tcp_retransmit_skb() we
won't be able to do so. So we will never retransmit. And if no ACK
comes back in to make some room we are stuck, no?


Christoph



>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 00c01a01b547ec67c971dc25a74c9258563cf871..06576540133806222f43d4a9532c5a929a2965b0 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1296,7 +1296,8 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue,
>         if (nsize < 0)
>                 nsize = 0;
>
> -       if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> +       if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
> +                    !tcp_rtx_queue_empty(sk))) {
>                 NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>                 return -ENOMEM;
>         }