netdev - Re: Linux ECN Handling


Message-ID: <CAK6E8=eN4rYY0-X_7QUDsCc2T-QPcq_1ybEVaKU7TzNL9KKmLA@mail.gmail.com>
Date:   Tue, 21 Nov 2017 07:51:05 -0800
From:   Yuchung Cheng <ycheng@...gle.com>
To:     Neal Cardwell <ncardwell@...gle.com>
Cc:     Steve Ibanez <sibanez@...nford.edu>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>,
        Mohammad Alizadeh <alizadeh@...il.mit.edu>,
        Lawrence Brakmo <brakmo@...com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: Linux ECN Handling

On Tue, Nov 21, 2017 at 7:01 AM, Neal Cardwell <ncardwell@...gle.com> wrote:
>
> On Tue, Nov 21, 2017 at 12:58 AM, Steve Ibanez <sibanez@...nford.edu> wrote:
> > Hi Neal,
> >
> > I tried your suggestion to disable tcp_tso_should_defer() and it does
> > indeed look like it is preventing the host from entering timeouts.
> > I'll have to do a bit more digging to try and find where the packets
> > are being dropped. I've verified that the bottleneck link queue
> > capacity is at about the configured marking threshold when the timeout
> > occurs, so the drops may be happening at the NIC interfaces or perhaps
> > somewhere unexpected in the switch.
>
> Great! Thanks for running that test.
>
> > I wonder if you can explain why the TLP doesn't fire when in the CWR
> > state? It seems like that might be worth having for cases like this.
>
> The original motivation for only allowing TLP in the CA_Open state was
> to be conservative and avoid having the TLP impose extra load on the
> bottleneck when it may be congested. Plus if there are any SACKed
> packets in the SACK scoreboard then there are other existing
> mechanisms to do speedy loss recovery.
Neal, I like your idea of covering more states in TLP, but shouldn't we
also fix the tso_deferral_logic to work better with PRR in the CWR state?
It's a general transmission issue.


>
> But at various times we have talked about expanding the set of
> scenarios where TLP is used. And I think this example demonstrates
> that there is a class of real-world cases where it probably makes
> sense to allow TLP in the CWR state.
>
> If you have time, would you be able to check if leaving
> tcp_tso_should_defer() as-is but enabling TLP probes in CWR state
> also fixes your performance issue? Perhaps something like
> (uncompiled/untested):
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4ea79b2ad82e..deccf8070f84 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2536,11 +2536,11 @@ bool tcp_schedule_loss_probe(struct sock *sk, bool advancing_rto)
>
>         early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
>         /* Schedule a loss probe in 2*RTT for SACK capable connections
> -        * in Open state, that are either limited by cwnd or application.
> +        * not in loss recovery, that are either limited by cwnd or application.
>          */
>         if ((early_retrans != 3 && early_retrans != 4) ||
>             !tp->packets_out || !tcp_is_sack(tp) ||
> -           icsk->icsk_ca_state != TCP_CA_Open)
> +           icsk->icsk_ca_state >= TCP_CA_Recovery)
>                 return false;
>
>         if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
>
> > Btw, thank you very much for all the help! It is greatly appreciated :)
>
> You are very welcome! :-)
>
> cheers,
> neal
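
For anyone reading the patch above, here is a minimal sketch of which
congestion states pass the old and the proposed check. It assumes the
ordering of the kernel's enum tcp_ca_state (Open=0, Disorder=1, CWR=2,
Recovery=3, Loss=4). The names ca_state, tlp_state_ok_old(),
tlp_state_ok_new() and print_tlp_state_table() are hypothetical, and the
other conditions in tcp_schedule_loss_probe() (SACK, packets_out,
early_retrans) are deliberately left out.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed to match the ordering of the kernel's enum tcp_ca_state;
 * the identifiers themselves are local to this sketch. */
enum ca_state {
	CA_OPEN = 0,
	CA_DISORDER = 1,
	CA_CWR = 2,
	CA_RECOVERY = 3,
	CA_LOSS = 4,
};

/* Old check: the state test passes only in Open. */
static bool tlp_state_ok_old(enum ca_state s)
{
	return s == CA_OPEN;
}

/* Proposed check: the state test fails only in Recovery and Loss. */
static bool tlp_state_ok_new(enum ca_state s)
{
	return s < CA_RECOVERY;
}

/* Print one row per state so the two checks can be compared. */
static void print_tlp_state_table(void)
{
	static const char *const names[] = {
		"Open", "Disorder", "CWR", "Recovery", "Loss",
	};

	for (int s = CA_OPEN; s <= CA_LOSS; s++)
		printf("%-8s old=%d new=%d\n", names[s],
		       tlp_state_ok_old((enum ca_state)s),
		       tlp_state_ok_new((enum ca_state)s));
}
```

Calling print_tlp_state_table() from a main() shows that the proposed
comparison lets the state test pass in Disorder as well as in CWR, since
it only excludes states at or above Recovery.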