linux-kernel - Re: [PATCH net-next 3/3] net: tcp: handle window shrink properly


Message-ID: <CADxym3a0gmzmD3Vwu_shoJnAHm-xjD5tJRuKwTvAXnVk_H55AA@mail.gmail.com>
Date:   Thu, 18 May 2023 10:34:59 +0800
From:   Menglong Dong <menglong8.dong@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     kuba@...nel.org, davem@...emloft.net, pabeni@...hat.com,
        dsahern@...nel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Menglong Dong <imagedong@...cent.com>
Subject: Re: [PATCH net-next 3/3] net: tcp: handle window shrink properly

On Wed, May 17, 2023 at 10:47 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Wed, May 17, 2023 at 2:42 PM <menglong8.dong@...il.com> wrote:
> >
> > From: Menglong Dong <imagedong@...cent.com>
> >
> > Window shrink is not allowed and also not handled for now, but it's
> > needed in some cases.
> >
> > In the original logic, the 0 probe is triggered only when there is no
> > data in the retrans queue and the receive window can't hold the data
> > of the first packet in the send queue.
> >
> > Now, let's change it and trigger the 0 probe in these cases:
> >
> > - the retrans queue has data and the first packet in it is not within
> > the receive window
> > - there is no data in the retrans queue and the first packet in the
> > send queue is beyond the end of the receive window
>
> Sorry, I do not understand.
>
> Please provide packetdrill tests for new behavior like that.
>

Yes. The problem can be reproduced easily.

1. Choose a server machine and decrease its tcp_mem with:
    echo '1024 1500 2048' > /proc/sys/net/ipv4/tcp_mem
2. Call listen() and accept() on a port, such as 8888. Call
    accept() in a loop without ever calling recv(), so that the
    data stays in the receive queue.
3. Choose a client machine and create 100 TCP connections
    to port 8888 of the server. Each connection then sends
    about 1 MB of data.
4. Some of the connections enter the 0-probe state, but others
    keep retransmitting again and again. This happens because
    the server has reached tcp_mem[2], so skbs are dropped before
    the receive buffer fills up and the connection can enter the
    0-probe state. Eventually, some of these connections time
    out and break.
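
The steps above could be scripted roughly as follows. This is only a
minimal sketch, not the test program actually used: the function names
start_idle_server() and flood(), the 4096-byte chunk size and the
timeout are assumptions made for illustration. The client uses
non-blocking sockets so that one stalled connection does not block the
others.

```python
import socket
import threading
import time


def start_idle_server(host="0.0.0.0", port=8888, backlog=128):
    """Listen and accept() in a loop, but never call recv()."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)
    accepted = []  # keep references so the sockets stay open

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            accepted.append(conn)  # data is left in the receive queue

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, accepted


def flood(host, port, n_conns=100, bytes_per_conn=1 << 20, timeout=5.0):
    """Open n_conns connections and try to send bytes_per_conn on each.

    Returns the sockets and the number of bytes queued per connection.
    """
    chunk = b"x" * 4096
    socks = [socket.create_connection((host, port)) for _ in range(n_conns)]
    for s in socks:
        s.setblocking(False)
    sent = [0] * n_conns
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline and any(n < bytes_per_conn for n in sent):
        progress = False
        for i, s in enumerate(socks):
            remaining = bytes_per_conn - sent[i]
            if remaining <= 0:
                continue
            try:
                sent[i] += s.send(chunk[:remaining])
                progress = True
            except BlockingIOError:
                pass  # local send buffer is full; the peer is not reading
        if not progress:
            time.sleep(0.05)
    return socks, sent
```

On the server, start_idle_server() would be run after lowering tcp_mem;
on the client, flood("<server address>", 8888) would then be called. The
sockets returned by flood() have to be kept open while the connections
are inspected. One way to tell probing connections from retransmitting
ones on the client is probably the timer column of "ss -o", where a
connection in the 0-probe state should show a persist timer.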

With this series, all 100 connections enter the 0-probe
state and no connection breaks. Data transfer recovers
if we increase tcp_mem or call recv() on the sockets on
the server.

> Also, such fundamental change would need IETF discussion first.
> We do not want linux to cause network collapses just because billions
> of devices send more zero probes.

I think it may be a good idea to make the connection enter
the 0-probe state rather than drop the skb silently. The
purpose of the 0-probe is to wait for space to become
available when the buffer of the receive queue is full.
Maybe we can also use the 0-probe when the "buffer" of the
"TCP protocol" as a whole (that is, tcp_mem) is full?

Am I right?

Thanks!
Menglong Dong