Date:   Sun, 1 Jul 2018 01:46:38 +0000
From:   Lawrence Brakmo <brakmo@...com>
To:     Neal Cardwell <ncardwell@...gle.com>
CC:     Netdev <netdev@...r.kernel.org>, Kernel Team <Kernel-team@...com>,
        "Blake Matheny" <bmatheny@...com>, Alexei Starovoitov <ast@...com>,
        Yuchung Cheng <ycheng@...gle.com>,
        Steve Ibanez <sibanez@...nford.edu>,
        Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH net-next 2/2] tcp: ack immediately when a cwr packet
 arrives

On 6/30/18, 11:23 AM, "Neal Cardwell" <ncardwell@...gle.com> wrote:

    On Fri, Jun 29, 2018 at 9:48 PM Lawrence Brakmo <brakmo@...com> wrote:
    >
    > We observed high 99th and 99.9th percentile latencies when doing RPCs
    > with DCTCP. The problem is triggered when the last packet of a request
    > arrives CE marked. The reply will carry the ECE mark, causing TCP to
    > shrink its cwnd to 1 (because there are no packets in flight). When the
    > 1st packet of the next request arrives, it is sometimes delayed, adding
    > up to 40ms to the latency.
    >
    > This patch ensures that arriving CWR-marked data packets will be acked
    > immediately.
    >
    > Signed-off-by: Lawrence Brakmo <brakmo@...com>
    > ---
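
    The 40ms figure above matches the kernel's delayed-ACK floor: as far as I
    can tell, TCP_DELACK_MIN in include/net/tcp.h is HZ/25, i.e. 40 ms for any
    HZ, with TCP_DELACK_MAX at HZ/5. A tiny standalone sketch of that
    arithmetic (not kernel code, just those two constants):

    /* Standalone sketch: print the delayed-ACK bounds for a few HZ values.
     * HZ/25 jiffies is 40 ms and HZ/5 jiffies is 200 ms no matter what HZ
     * is, which is where the extra ~40 ms per RPC comes from when the ACK
     * sits on the delayed-ACK timer.
     */
    #include <stdio.h>

    int main(void)
    {
        static const unsigned int hz_values[] = { 100, 250, 300, 1000 };
        unsigned int i;

        for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++) {
            unsigned int hz = hz_values[i];
            unsigned int delack_min = hz / 25;  /* TCP_DELACK_MIN, in jiffies */
            unsigned int delack_max = hz / 5;   /* TCP_DELACK_MAX, in jiffies */

            printf("HZ=%-4u  delack min=%u ms  max=%u ms\n",
                   hz, delack_min * 1000 / hz, delack_max * 1000 / hz);
        }
        return 0;
    }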
    
    Thanks, Larry. Ensuring that arriving CWR-marked data packets are
    acked immediately sounds like a good goal to me.
    
    I am wondering if we can have a simpler fix.
    
    The dctcp_ce_state_0_to_1() code is setting the TCP_ECN_DEMAND_CWR
    bit in ecn_flags, which disables the code in __tcp_ecn_check_ce() that
    would have otherwise used the tcp_enter_quickack_mode() mechanism to
    force an ACK:
    
    __tcp_ecn_check_ce():
    ...
    case INET_ECN_CE:
      if (tcp_ca_needs_ecn(sk))
        tcp_ca_event(sk, CA_EVENT_ECN_IS_CE);
           // -> dctcp_ce_state_0_to_1()
           //     ->  tp->ecn_flags |= TCP_ECN_DEMAND_CWR;
    
      if (!(tp->ecn_flags & TCP_ECN_DEMAND_CWR)) {
        /* Better not delay acks, sender can have a very low cwnd */
        tcp_enter_quickack_mode(sk, 1);
        tp->ecn_flags |= TCP_ECN_DEMAND_CWR;
      }
      tp->ecn_flags |= TCP_ECN_SEEN;
      break;
    
    So it seems like the bug here may be that dctcp_ce_state_0_to_1() is
    setting the TCP_ECN_DEMAND_CWR bit in ecn_flags, when really it
    should let its caller, __tcp_ecn_check_ce(), set TCP_ECN_DEMAND_CWR, in
    which case the code would already properly force an immediate ACK.
    
    So, what if we use this fix instead (not compiled, not tested):
    
    diff --git a/net/ipv4/tcp_dctcp.c b/net/ipv4/tcp_dctcp.c
    index 5f5e5936760e..4fecd2824edb 100644
    --- a/net/ipv4/tcp_dctcp.c
    +++ b/net/ipv4/tcp_dctcp.c
    @@ -152,8 +152,6 @@ static void dctcp_ce_state_0_to_1(struct sock *sk)
    
            ca->prior_rcv_nxt = tp->rcv_nxt;
            ca->ce_state = 1;
    -
    -       tp->ecn_flags |= TCP_ECN_DEMAND_CWR;
     }
    
     static void dctcp_ce_state_1_to_0(struct sock *sk)
    
    What do you think?
    
    neal

I see two issues. One is that entering quickack mode, as you mentioned, does not ensure that it will still be in effect when the CWR-marked packet arrives. The second is that the problem occurs right after the receiver sends a small reply, which puts it back into pingpong mode right before the sender starts the new request by sending just one packet (i.e. the ACK gets delayed).
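
To make that concrete, here is a toy standalone model of the two issues (not kernel code, just simplified quick-ACK credit and pingpong semantics): the single quick-ACK credit granted when the CE mark is seen is used up acking that segment, and sending the small reply re-enters pingpong mode, so the CWR-marked packet that starts the next request lands on the delayed-ACK path.

/* Toy model, not kernel code: simplified quick-ACK credit and pingpong
 * semantics, just to illustrate the two issues above.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_receiver {
        unsigned int quick;     /* remaining quick-ACK credit */
        bool pingpong;          /* interactive mode: delay lone ACKs */
};

/* Grant one quick ACK, roughly what tcp_enter_quickack_mode(sk, 1) would
 * do for us when the CE mark is seen. */
static void rx_ce_mark(struct toy_receiver *r)
{
        r->quick = 1;
        r->pingpong = false;
}

/* Sending the small RPC reply puts the receiver back into pingpong mode. */
static void tx_small_reply(struct toy_receiver *r)
{
        r->pingpong = true;
}

/* Decide how one arriving data segment gets acked. */
static const char *rx_data(struct toy_receiver *r)
{
        if (r->quick) {
                r->quick--;
                return "immediate ACK (quick-ACK credit)";
        }
        return r->pingpong ? "delayed ACK (up to ~40 ms)" : "immediate ACK";
}

int main(void)
{
        struct toy_receiver r = { .quick = 0, .pingpong = true };

        rx_ce_mark(&r);               /* last packet of the request is CE marked */
        printf("CE-marked segment:  %s\n", rx_data(&r));  /* credit used up here */
        tx_small_reply(&r);           /* small reply re-enters pingpong mode */
        printf("CWR-marked segment: %s\n", rx_data(&r));  /* 1st packet of next request */
        return 0;
}

With only one quick-ACK credit and pingpong re-entered by the reply, the ACK for the CWR-marked packet is the one that ends up delayed, which is what the traces show.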

I compiled and tested your patch. Both the 99th and 99.9th percentile latencies are around 40ms. Looking at the packet traces shows that some CWR-marked packets are not being acked immediately (they are delayed by 40ms).
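
For reference, here is a rough standalone sketch of the behavior this patch is after, namely acking immediately whenever an arriving data segment carries CWR. It is not the actual patch; ack_now() and schedule_delack() are made-up placeholders, and the only real dependency is the Linux UAPI struct tcphdr, which exposes the CWR bit as a bitfield.

/* Sketch only, not the actual patch: if the arriving data segment carries
 * the CWR bit, skip the delayed-ACK path and ACK at once, since the sender
 * that just reacted to our ECE may have a cwnd of a single packet.
 */
#include <stdbool.h>
#include <stdio.h>
#include <linux/tcp.h>          /* struct tcphdr with .cwr / .ece bitfields */

static void ack_now(void)         { puts("ACK sent immediately"); }
static void schedule_delack(void) { puts("ACK delayed (up to ~40 ms)"); }

/* Decide how to ACK one arriving data segment. */
static void rx_data_segment(const struct tcphdr *th, bool delack_mode)
{
        if (th->cwr) {
                ack_now();
                return;
        }
        if (delack_mode)
                schedule_delack();
        else
                ack_now();
}

int main(void)
{
        struct tcphdr th = { 0 };

        th.cwr = 1;
        rx_data_segment(&th, true);     /* CWR set: ACK sent immediately */

        th.cwr = 0;
        rx_data_segment(&th, true);     /* no CWR, delack mode: ACK delayed */
        return 0;
}

Keying the decision on the CWR bit itself, rather than on whatever quick-ACK state happens to be left over, is what keeps the sender's one-packet cwnd from waiting on the delayed-ACK timer.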

Larry

    
