netdev - Re: [PATCH net-next] tcp: Optimize the recovery of tcp when lack of SACK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CADVnQyksf4Nt2hHsWaAs3wLOK+rDp79ph5TZywMqfEAPOVgzww@mail.gmail.com>
Date:   Fri, 17 Jul 2020 10:44:47 -0400
From:   Neal Cardwell <ncardwell@...gle.com>
To:     hujunwei <hujunwei4@...wei.com>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        David Miller <davem@...emloft.net>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Netdev <netdev@...r.kernel.org>, wangxiaogang3@...wei.com,
        jinyiting@...wei.com, xuhanbing@...wei.com, zhengshaoyu@...wei.com,
        Yuchung Cheng <ycheng@...gle.com>,
        Ilpo Jarvinen <ilpo.jarvinen@...helsinki.fi>
Subject: Re: [PATCH net-next] tcp: Optimize the recovery of tcp when lack of SACK

On Fri, Jul 17, 2020 at 7:43 AM hujunwei <hujunwei4@...wei.com> wrote:
>
> From: Junwei Hu <hujunwei4@...wei.com>
>
> In the document of RFC2582(https://tools.ietf.org/html/rfc2582)
> introduced two separate scenarios for tcp congestion control:
> There are two separate scenarios in which the TCP sender could
> receive three duplicate acknowledgements acknowledging "send_high"
> but no more than "send_high".  One scenario would be that the data
> sender transmitted four packets with sequence numbers higher than
> "send_high", that the first packet was dropped in the network, and
> the following three packets triggered three duplicate
> acknowledgements acknowledging "send_high".  The second scenario
> would be that the sender unnecessarily retransmitted three packets
> below "send_high", and that these three packets triggered three
> duplicate acknowledgements acknowledging "send_high".  In the absence
> of SACK, the TCP sender in unable to distinguish between these two
> scenarios.
>
> We encountered the second scenario when the third-party switches
> does not support SACK, and I use kprobes to find that tcp kept in
> CA_Loss state when high_seq equal to snd_nxt.
>
> All of the packets is acked if high_seq equal to snd_nxt, the TCP
> sender is able to distinguish between these two scenarios in
> described RFC2582. So the current state can be switched.

Can you please elaborate on how the sender is able to distinguish
between the two scenarios, after your patch?

It seems to me that with this proposed patch, there is the risk of
spurious fast recoveries due to 3 dupacks in the second second
scenario (the sender unnecessarily retransmitted three packets below
"send_high"). Can you please post a packetdrill test to demonstrate
that with this patch the TCP sender does not spuriously enter fast
recovery in such a scenario?

> This patch enhance the TCP congestion control algorithm for lack
> of SACK.

You describe this as an enhancement. Can you please elaborate on the
drawback/downside of staying in CA_Loss in this case you are
describing (where you used kprobes to find that TCP stayed in CA_Loss
state when high_seq was equal to snd_nxt)?

> Signed-off-by: Junwei Hu <hujunwei4@...wei.com>
> Reviewed-by: XiaoGang Wang <wangxiaogang3@...wei.com>
> Reviewed-by: Yiting Jin <jinyiting@...wei.com>
> ---
>  net/ipv4/tcp_input.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 9615e72..d5573123 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2385,7 +2385,8 @@ static bool tcp_try_undo_recovery(struct sock *sk)
>         } else if (tp->rack.reo_wnd_persist) {
>                 tp->rack.reo_wnd_persist--;
>         }
> -       if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
> +       if (tp->snd_una == tp->high_seq &&
> +           tcp_is_reno(tp) && tp->snd_nxt > tp->high_seq) {

To deal with sequence number wrap-around, sequence number comparisons
in TCP need to use the before() and after() helpers, rather than
comparison operators. Here it seems the patch should use after()
rather than >. However,  I think the larger concern is the concern
mentioned above.

best,
neal