netdev - Re: [PATCH net] tcp: fix TFO SYNACK undo to avoid double-timestamp-undo

Message-Id: <20200223.172528.2144668063707204291.davem@davemloft.net>
Date:   Sun, 23 Feb 2020 17:25:28 -0800 (PST)
From:   David Miller <davem@...emloft.net>
To:     ncardwell@...gle.com
Cc:     netdev@...r.kernel.org, ycheng@...gle.com, edumazet@...gle.com
Subject: Re: [PATCH net] tcp: fix TFO SYNACK undo to avoid
 double-timestamp-undo

From: Neal Cardwell <ncardwell@...gle.com>
Date: Sat, 22 Feb 2020 11:21:15 -0500

> In a rare corner case the new logic for undo of SYNACK RTO could
> result in triggering the warning in tcp_fastretrans_alert() that says:
>         WARN_ON(tp->retrans_out != 0);
> 
> The warning looked like:
> 
> WARNING: CPU: 1 PID: 1 at net/ipv4/tcp_input.c:2818 tcp_ack+0x13e0/0x3270
> 
> The sequence that tickles this bug is:
>  - Fast Open server receives TFO SYN with data, sends SYNACK
>  - (client receives SYNACK and sends ACK, but ACK is lost)
>  - server app sends some data packets
>  - (N of the first data packets are lost)
>  - server receives client ACK that has a TS ECR matching first SYNACK,
>    and also SACKs suggesting the first N data packets were lost
>     - server performs TS undo of SYNACK RTO, then immediately
>       enters recovery
>     - buggy behavior then performed a *second* undo that caused
>       the connection to be in CA_Open with retrans_out != 0
> 
> Basically, the incoming ACK packet with SACK blocks causes us to first
> undo the cwnd reduction from the SYNACK RTO, but then immediately
> enter fast recovery, which then makes us eligible for undo again. And
> then tcp_rcv_synrecv_state_fastopen() accidentally performs an undo
> using a "mash-up" of state from two different loss recovery phases: it
> uses the timestamp info from the ACK of the original SYNACK, and the
> undo_marker from the fast recovery.
> 
> This fix refines the logic to invoke tcp_try_undo_loss() inside
> tcp_rcv_synrecv_state_fastopen() only if the connection is still in
> CA_Loss.  If peer SACKs triggered fast recovery, then
> tcp_rcv_synrecv_state_fastopen() can't safely undo.
> 
> Fixes: 794200d66273 ("tcp: undo cwnd on Fast Open spurious SYNACK retransmit")
> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>

Applied and queued up for -stable.
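
For readers following the thread, the guard described in the quoted commit
message can be illustrated with a small toy model. This is only a sketch under
stated assumptions, not the kernel code: struct toy_sock, the toy_* functions,
and the ts_says_spurious flag are hypothetical stand-ins for the real
struct tcp_sock state and for tcp_try_undo_loss(), and the starting state in
toy_after_sack_ack() is an assumed snapshot of the connection right after the
SACK-bearing ACK has pushed it into fast recovery.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the congestion-control states named in the thread. */
enum ca_state { CA_OPEN, CA_RECOVERY, CA_LOSS };

/* Toy subset of per-connection state; NOT the real struct tcp_sock. */
struct toy_sock {
    enum ca_state ca_state;
    unsigned int retrans_out;  /* retransmitted packets still in flight */
    unsigned int undo_marker;  /* nonzero while an undo is considered possible */
    bool ts_says_spurious;     /* TS ECR matched the first SYNACK (assumed flag) */
};

/* Hypothetical undo: if the timestamp evidence allows it, return to Open. */
bool toy_try_undo_loss(struct toy_sock *sk)
{
    if (!sk->undo_marker || !sk->ts_says_spurious)
        return false;
    sk->undo_marker = 0;
    sk->ca_state = CA_OPEN;
    return true;
}

/* Pre-fix behavior in this model: attempt the undo in any state. */
void toy_synrecv_fastopen_buggy(struct toy_sock *sk)
{
    toy_try_undo_loss(sk);
}

/* Post-fix behavior in this model: undo only while still in CA_Loss. */
void toy_synrecv_fastopen_fixed(struct toy_sock *sk)
{
    if (sk->ca_state == CA_LOSS)
        toy_try_undo_loss(sk);
}

/* Models the condition the thread reports: CA_Open with retrans_out != 0. */
bool toy_would_warn(const struct toy_sock *sk)
{
    return sk->ca_state == CA_OPEN && sk->retrans_out != 0;
}

/* Assumed snapshot: SYNACK RTO already undone, fast recovery entered,
 * one retransmission outstanding, undo_marker set by the new recovery. */
struct toy_sock toy_after_sack_ack(void)
{
    struct toy_sock sk = {
        .ca_state = CA_RECOVERY,
        .retrans_out = 1,
        .undo_marker = 1,
        .ts_says_spurious = true,
    };
    return sk;
}

/* Runs both variants on the same snapshot and checks the outcome. */
void toy_demo(void)
{
    struct toy_sock buggy = toy_after_sack_ack();
    struct toy_sock fixed = toy_after_sack_ack();

    toy_synrecv_fastopen_buggy(&buggy);
    toy_synrecv_fastopen_fixed(&fixed);

    assert(toy_would_warn(&buggy));   /* second undo mixes two recovery phases */
    assert(!toy_would_warn(&fixed));  /* guard leaves fast recovery untouched */
}
```

Calling toy_demo() from a main() exercises both paths. In this model the
unguarded variant combines the timestamp evidence from the SYNACK with the
undo_marker from fast recovery and lands in CA_OPEN with a retransmission
still outstanding, while the guarded variant leaves the connection in
CA_RECOVERY.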