Message-ID: <CAK6E8=d+hyMqpXa7ViBUBLATT+8iK89Wbgd=N5Ff-QbOxu-M5w@mail.gmail.com>
Date: Wed, 27 Jun 2018 15:27:35 -0700
From: Yuchung Cheng <ycheng@...gle.com>
To: Lawrence Brakmo <brakmo@...com>
Cc: Neal Cardwell <ncardwell@...gle.com>,
Matt Mathis <mattmathis@...gle.com>,
Netdev <netdev@...r.kernel.org>,
Kernel Team <Kernel-team@...com>,
Blake Matheny <bmatheny@...com>,
Alexei Starovoitov <ast@...com>,
Eric Dumazet <eric.dumazet@...il.com>,
Wei Wang <weiwan@...gle.com>
Subject: Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction
On Wed, Jun 27, 2018 at 1:00 PM, Lawrence Brakmo <brakmo@...com> wrote:
>
> From: Yuchung Cheng <ycheng@...gle.com>
> Date: Wednesday, June 27, 2018 at 9:59 AM
> Subject: Re: [PATCH net-next v2] tcp: force cwnd at least 2 in tcp_cwnd_reduction
>
> On Wed, Jun 27, 2018 at 8:24 AM, Neal Cardwell <ncardwell@...gle.com> wrote:
>
> On Tue, Jun 26, 2018 at 10:34 PM Lawrence Brakmo <brakmo@...com> wrote:
>
> The only issue is if it is safe to always use 2 or if it is better to
> use min(2, snd_ssthresh) (which could still trigger the problem).
>
> Always using 2 SGTM. I don't think we need min(2, snd_ssthresh), as
> that should be the same as just 2, since:
>
>
> (a) RFCs mandate ssthresh should not be below 2, e.g.
> https://tools.ietf.org/html/rfc5681, page 7:
>
>     ssthresh = max (FlightSize / 2, 2*SMSS)            (4)
>
>
> (b) The main loss-based CCs used in Linux (CUBIC, Reno, DCTCP) respect
> that constraint, and always have an ssthresh of at least 2.
>
> And if some CC misbehaves and uses a lower ssthresh, then taking
> min(2, snd_ssthresh) will trigger problems, as you note.
>
>
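[As an aside from this editor, not part of the thread: RFC 5681's eq. (4)
above can be sketched in plain C outside the kernel. FlightSize and SMSS
are in bytes, as in the RFC.]

```c
/* Illustration of RFC 5681 eq. (4):
 *     ssthresh = max(FlightSize / 2, 2*SMSS)
 * The 2*SMSS term is the floor that keeps ssthresh at no less than
 * two segments' worth of data, the property Neal relies on above. */
static unsigned int rfc5681_ssthresh(unsigned int flight_bytes,
                                     unsigned int smss_bytes)
{
    unsigned int half  = flight_bytes / 2;
    unsigned int floor = 2 * smss_bytes;

    return half > floor ? half : floor;
}
```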
>     + tp->snd_cwnd = max((int)tcp_packets_in_flight(tp) + sndcnt, 2);
>
> AFAICT this does seem like it will make the sender behavior more
> aggressive in cases with high loss and/or a very low per-flow
> fair-share.
>
>
> Old:
>
> o send N packets
> o receive SACKs for last 3 packets
> o fast retransmit packet 1
> o using ACKs, slow-start upward
>
> New:
>
> o send N packets
> o receive SACKs for last 3 packets
> o fast retransmit packets 1 and 2
> o using ACKs, slow-start upward
>
> In the extreme case, if the available fair share is less than 2
> packets, whereas inflight would have oscillated between 1 packet and 2
> packets with the existing code, it seems like with this commit the
> inflight will now hover at 2. It seems like this would have
> significantly higher losses than we had with the existing code.
>
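[Editor's note, not from the thread: to make the extreme case concrete,
here is a toy model of the one-line change under discussion, with
tcp_packets_in_flight() reduced to a plain integer argument.]

```c
/* Toy model of the cwnd computation at the end of tcp_cwnd_reduction():
 *     old: cwnd = inflight + sndcnt
 *     new: cwnd = max(inflight + sndcnt, 2)   (the proposed floor)
 * With the old code, cwnd can fall to 1 when inflight is 1 and PRR
 * releases no new segment; the patched version can no longer do so. */
static int prr_cwnd_old(int inflight, int sndcnt)
{
    return inflight + sndcnt;
}

static int prr_cwnd_patched(int inflight, int sndcnt)
{
    int cwnd = inflight + sndcnt;

    return cwnd > 2 ? cwnd : 2;
}
```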
> I share a similar concern. Note that this function is used by most
> existing congestion control modules besides DCTCP, so I am cautious
> about changing it to address a DCTCP issue.
>
> Theoretically it could happen with any CC, but I have only seen it
> consistently with DCTCP.
>
> One option, as I mentioned to Neal, is to use 2 only when the cwnd
> reduction is caused by an ECN signal.
>
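[Editor's sketch, not from the thread: the ECN-conditional option could
look like the following. How the reduction's trigger is detected is left
out and modeled as a plain flag.]

```c
/* Hypothetical variant: apply the floor of 2 only when the reduction
 * was triggered by ECN, leaving loss-triggered reductions unchanged.
 * ecn_triggered stands in for whatever real-kernel state would record
 * that the CWR episode came from an ECN echo rather than loss. */
static int prr_cwnd_ecn_floor(int inflight, int sndcnt, int ecn_triggered)
{
    int cwnd = inflight + sndcnt;

    if (ecn_triggered && cwnd < 2)
        cwnd = 2;
    return cwnd;
}
```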
> One problem the DCTCP paper notes is that cwnd = 1 is still too big
> when the bottleneck is shared by many flows (e.g. incast). It
> specifically suggests changing the lower bound of 2 in the spec to 1
> (Section 8.2):
>
> https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-judd.pdf
>
> One could even envision using values lower than 1, where we send one
> packet every 2 RTTs. However, I do not think this is a problem that
> needs to be fixed.
>
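[Editor's note, not from the thread: an average window below one
segment, as envisioned above, amounts to spacing transmissions over
multiple RTTs. A toy calculation:]

```c
/* One packet every k RTTs gives an average window of 1/k segments.
 * Expressed in thousandths of a segment to stay in integer arithmetic. */
static int avg_cwnd_millisegments(int rtts_per_packet)
{
    return 1000 / rtts_per_packet;
}
```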
> I am curious about the differences you observe in 4.11 and 4.16. I
> wasn't aware of any (significant) change in the tcp_cwnd_reduction /
> PRR algorithm between 4.11 and 4.16. Also, the receiver should not
> delay ACKs if it has out-of-order packets or is receiving CE-marked
> data packets. This means the delayed ACK is caused by tail losses
> where the last data received carries no CE mark: that seems a less
> common scenario?
>
> I have the feeling the differences between 4.11 and 4.16 are due to
> accounting changes and not a change in CC behavior.
If the difference is not due to a change in PRR, I don't think we
should change PRR to address it until we have some evidence.
Could you share some traces?
>
> The packet and ACK are not lost. I captured pcaps on both hosts and
> could see cases where the data packet arrived but the ACK was not sent
> before the packet was retransmitted. I saw this behavior multiple
> times. In many cases the ACK did not show up in the pcap within the
> 40ms one would usually expect with a delayed ACK.
>
> If delayed-ACK is the problem, we probably should fix the receiver to
> delay ACKs more intelligently, not the sender. weiwan@...gle.com is
> working on it.
>
> Not sure if it is a delayed ACK or whether something else caused the
> ACK to not be sent, since it did not show up within 40ms as one would
> usually expect for a delayed ACK.
>
> This may or may not be OK in practice, but IMHO it is worth mentioning
> and discussing.
>
> neal