[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK6E8=cBPjWafA0fvrzwh84Jz4xE47CRjK0LvUimD5Ui590Nng@mail.gmail.com>
Date: Mon, 9 Jul 2018 09:31:08 -0700
From: Yuchung Cheng <ycheng@...gle.com>
To: Neal Cardwell <ncardwell@...gle.com>
Cc: David Miller <davem@...emloft.net>,
Lawrence Brakmo <brakmo@...com>,
Netdev <netdev@...r.kernel.org>,
Kernel Team <kernel-team@...com>,
Blake Matheny <bmatheny@...com>,
Alexei Starovoitov <ast@...com>,
Steve Ibanez <sibanez@...nford.edu>,
Eric Dumazet <eric.dumazet@...il.com>,
Yousuk Seung <ysseung@...gle.com>
Subject: Re: [PATCH net-next v3 0/2] tcp: fix high tail latencies in DCTCP
On Sat, Jul 7, 2018 at 7:07 AM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Sat, Jul 7, 2018 at 7:15 AM David Miller <davem@...emloft.net> wrote:
>>
>> From: Lawrence Brakmo <brakmo@...com>
>> Date: Tue, 3 Jul 2018 09:26:13 -0700
>>
>> > When have observed high tail latencies when using DCTCP for RPCs as
>> > compared to using Cubic. For example, in one setup there are 2 hosts
>> > sending to a 3rd one, with each sender having 3 flows (1 stream,
>> > 1 1MB back-to-back RPCs and 1 10KB back-to-back RPCs). The following
>> > table shows the 99% and 99.9% latencies for both Cubic and dctcp:
>> >
>> > Cubic 99% Cubic 99.9% dctcp 99% dctcp 99.9%
>> > 1MB RPCs 2.6ms 5.5ms 43ms 208ms
>> > 10KB RPCs 1.1ms 1.3ms 53ms 212ms
>> ...
>> > v2: Removed call to tcp_ca_event from tcp_send_ack since I added one in
>> > tcp_event_ack_sent. Based on Neal Cardwell <ncardwell@...gle.com>
>> > feedback.
>> > Modified tcp_ecn_check_ce (and renamed it tcp_ecn_check) instead of modifying
>> > tcp_ack_send_check to insure an ACK when cwr is received.
>> > v3: Handling cwr in tcp_ecn_accept_cwr instead of in tcp_ecn_check.
>> >
>> > [PATCH net-next v3 1/2] tcp: notify when a delayed ack is sent
>> > [PATCH net-next v3 2/2] tcp: ack immediately when a cwr packet
>>
>> Neal and co., what are your thoughts right now about this patch series?
>>
>> Thank you.
>
> IMHO these patches are a definite improvement over what we have now.
>
> That said, in chatting with Yuchung before the July 4th break, I think
> Yuchung and I agreed that we would ideally like to see something like
> the following:
>
> (1) refactor the DCTCP code to check for pending delayed ACKs directly
> using existing state (inet_csk(sk)->icsk_ack.pending &
> ICSK_ACK_TIMER), and remove the ca->delayed_ack_reserved DCTCP field
> and the CA_EVENT_DELAYED_ACK and CA_EVENT_NON_DELAYED_ACK callbacks
> added for DCTCP (which Larry determined had at least one bug).
>
> (2) fix the bug with the DCTCP call to tcp_send_ack(sk) causing
> delayed ACKs to be incorrectly dropped/forgotten (not yet addressed by
> this patch series)
>
> (3) then with fixes (1) and (2) in place, re-run tests and see if we
> still need Larry's heuristic (in patch 2) to fire an ACK immediately
> if a receiver receives a CWR packet (I suspect this is still very
> useful, but I think Yuchung is reluctant to add this complexity unless
> we have verified it's still needed after (1) and (2))
>
> Our team may be able to help out with some proposed patches for (1) and (2).
>
> In any case, I would love to have Yuchung and Eric weigh in (perhaps
> Monday) before we merge this patch series.
Thanks Neal. Sorry for not reflecting these timely before I took off
for July 4 holidays. I was going to post the same comment - Larry: I
could provide draft patches if that helps.
>
> Thanks,
> neal
Powered by blists - more mailing lists