netdev - Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAK6E8=eKr9aMVebKWZ=mff8f5Y+VpDX86B2KCKm74HWGJ-md4Q@mail.gmail.com>
Date:   Wed, 27 Jun 2018 09:57:58 -0700
From:   Yuchung Cheng <ycheng@...gle.com>
To:     Neal Cardwell <ncardwell@...gle.com>
Cc:     Lawrence Brakmo <brakmo@...com>,
        Matt Mathis <mattmathis@...gle.com>,
        Netdev <netdev@...r.kernel.org>,
        Kernel Team <kernel-team@...com>,
        Blake Matheny <bmatheny@...com>,
        Alexei Starovoitov <ast@...com>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Wei Wang <weiwan@...gle.com>
Subject: Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction

On Wed, Jun 27, 2018 at 8:24 AM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Tue, Jun 26, 2018 at 10:34 PM Lawrence Brakmo <brakmo@...com> wrote:
>> The only issue is if it is safe to always use 2 or if it is better to
>> use min(2, snd_ssthresh) (which could still trigger the problem).
>
> Always using 2 SGTM. I don't think we need min(2, snd_ssthresh), as
> that should be the same as just 2, since:
>
> (a) RFCs mandate ssthresh should not be below 2, e.g.
> https://tools.ietf.org/html/rfc5681 page 7:
>
>  ssthresh = max (FlightSize / 2, 2*SMSS)            (4)
>
> (b) The main loss-based CCs used in Linux (CUBIC, Reno, DCTCP) respect
> that constraint, and always have an ssthresh of at least 2.
>
> And if some CC misbehaves and uses a lower ssthresh, then taking
> min(2, snd_ssthresh) will trigger problems, as you note.
>
>> +       tp->snd_cwnd = max((int)tcp_packets_in_flight(tp) + sndcnt, 2);
>
> AFAICT this does seem like it will make the sender behavior more
> aggressive in cases with high loss and/or a very low per-flow
> fair-share.
>
> Old:
>
> o send N packets
> o receive SACKs for last 3 packets
> o fast retransmit packet 1
> o using ACKs, slow-start upward
>
> New:
>
> o send N packets
> o receive SACKs for last 3 packets
> o fast retransmit packets 1 and 2
> o using ACKs, slow-start upward
>
> In the extreme case, if the available fair share is less than 2
> packets, whereas inflight would have oscillated between 1 packet and 2
> packets with the existing code, it now seems like with this commit the
> inflight will now hover at 2. It seems like this would have
> significantly higher losses than we had with the existing code.
I share similar concern. Note that this function is used by most
existing congestion control modules beside DCTCP so I am more cautious
of changing this to address DCTCP issue.

One problem that DCTCP paper notices when cwnd = 1 is still too big
when the bottleneck
is shared by many flows (e.g. incast). It specifically suggest
changing the lower-bound of 2 in the spec to 1. (Section 8.2).
https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-judd.pdf

I am curious about the differences you observe in 4.11 and 4.16. I
wasn't aware of any (significant) change in tcp_cwnd_reduction / PRR
algorithm between 4.11 and 4.16. Also the receiver should not delay
ACKs if it has out-of-order packet or receiving CE data packets. This
means the delayed ACK is by tail losses and the last data received
carries no CE mark: seems a less common scenario?

If delayed-ACK is the problem, we probably should fix the receiver to
delay ACK more intelligently, not the sender. weiwan@...gle.com is
working on it.



>
> This may or may not be OK in practice, but IMHO it is worth mentioning
> and discussing.
>
> neal