Message-Id: <20200223.172528.2144668063707204291.davem@davemloft.net>
Date: Sun, 23 Feb 2020 17:25:28 -0800 (PST)
From: David Miller <davem@...emloft.net>
To: ncardwell@...gle.com
Cc: netdev@...r.kernel.org, ycheng@...gle.com, edumazet@...gle.com
Subject: Re: [PATCH net] tcp: fix TFO SYNACK undo to avoid double-timestamp-undo
From: Neal Cardwell <ncardwell@...gle.com>
Date: Sat, 22 Feb 2020 11:21:15 -0500
> In a rare corner case the new logic for undo of SYNACK RTO could
> result in triggering the warning in tcp_fastretrans_alert() that says:
> WARN_ON(tp->retrans_out != 0);
>
> The warning looked like:
>
> WARNING: CPU: 1 PID: 1 at net/ipv4/tcp_input.c:2818 tcp_ack+0x13e0/0x3270
>
> The sequence that tickles this bug is:
> - Fast Open server receives TFO SYN with data, sends SYNACK
> - (client receives SYNACK and sends ACK, but ACK is lost)
> - server app sends some data packets
> - (N of the first data packets are lost)
> - server receives client ACK that has a TS ECR matching first SYNACK,
>   and also SACKs suggesting the first N data packets were lost
> - server performs TS undo of SYNACK RTO, then immediately
>   enters recovery
> - buggy behavior then performed a *second* undo that caused
>   the connection to be in CA_Open with retrans_out != 0
>
> Basically, the incoming ACK packet with SACK blocks causes us to first
> undo the cwnd reduction from the SYNACK RTO, but then immediately
> enter fast recovery, which then makes us eligible for undo again. And
> then tcp_rcv_synrecv_state_fastopen() accidentally performs an undo
> using a "mash-up" of state from two different loss recovery phases: it
> uses the timestamp info from the ACK of the original SYNACK, and the
> undo_marker from the fast recovery.
>
> This fix refines the logic to invoke tcp_try_undo_loss() inside
> tcp_rcv_synrecv_state_fastopen() only if the connection is still in
> CA_Loss. If peer SACKs triggered fast recovery, then
> tcp_rcv_synrecv_state_fastopen() can't safely undo.
>
> Fixes: 794200d66273 ("tcp: undo cwnd on Fast Open spurious SYNACK retransmit")
> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Applied and queued up for -stable.
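
For readers following the sequence in the commit message, here is a minimal
userspace sketch of the double undo and of the CA_Loss guard. It is a toy
model, not the kernel code: every name in it (toy_conn, toy_try_undo,
toy_enter_recovery, toy_run) is hypothetical, the undo check is reduced to
two booleans, and setting retrans_out on entry to recovery is an assumed
stand-in for the retransmissions that fast recovery would send.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy congestion states, loosely mirroring CA_Open, CA_Recovery, CA_Loss. */
enum toy_ca_state { TOY_CA_OPEN, TOY_CA_RECOVERY, TOY_CA_LOSS };

struct toy_conn {
	enum toy_ca_state ca_state;
	unsigned int retrans_out;  /* retransmitted packets still in flight */
	bool undo_marker;          /* true while some recovery phase can be undone */
	bool ts_says_spurious;     /* ACK's TS ECR matches the first SYNACK */
};

/* Hypothetical stand-in for a timestamp-based undo: revert to Open if allowed. */
static bool toy_try_undo(struct toy_conn *c)
{
	if (!c->undo_marker || !c->ts_says_spurious)
		return false;
	c->undo_marker = false;
	c->ca_state = TOY_CA_OPEN;
	return true;
}

/* SACKs report losses: start fast recovery and retransmit the lost packets. */
static void toy_enter_recovery(struct toy_conn *c, unsigned int lost)
{
	assert(lost > 0);
	c->ca_state = TOY_CA_RECOVERY;
	c->undo_marker = true;   /* marker now belongs to the fast recovery phase */
	c->retrans_out = lost;
}

/* Process the ACK from the sequence; 'guarded' selects the fixed behavior. */
struct toy_conn toy_run(bool guarded, unsigned int lost)
{
	/* After the SYNACK RTO: in Loss, undo possible, no data retransmits out. */
	struct toy_conn c = { TOY_CA_LOSS, 0, true, true };

	toy_try_undo(&c);                     /* first undo: the SYNACK RTO */
	if (lost > 0)
		toy_enter_recovery(&c, lost); /* SACKs trigger fast recovery */

	/* Second undo attempt, made while completing the Fast Open handshake.
	 * Unguarded, it mixes the old timestamp verdict with the new marker.
	 */
	if (!guarded || c.ca_state == TOY_CA_LOSS)
		toy_try_undo(&c);
	return c;
}
```

In this model, toy_run(false, 2) ends in the Open state with retrans_out
still at 2, which is the inconsistent state the WARN_ON catches, while
toy_run(true, 2) stays in Recovery because the guard skips the second undo.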