[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aa581b7f-8dbc-5b62-dfc9-a21f439ab80c@gmail.com>
Date: Sat, 3 Sep 2022 10:03:56 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Neal Cardwell <ncardwell.kernel@...il.com>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, Neal Cardwell <ncardwell@...gle.com>,
Nagaraj Arankal <nagaraj.p.arankal@....com>,
Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [PATCH net] tcp: fix early ETIMEDOUT after spurious non-SACK RTO
On 9/3/22 05:10, Neal Cardwell wrote:
> From: Neal Cardwell <ncardwell@...gle.com>
>
> Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
> of a spurious non-SACK RTO could cause a connection to fail to clear
> retrans_stamp, causing a later RTO to very prematurely time out the
> connection with ETIMEDOUT.
>
> Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
> report:
>
> (*1) Send one data packet on a non-SACK connection
>
> (*2) Because no ACK packet is received, the packet is retransmitted
> and we enter CA_Loss; but this retransmission is spurious.
>
> (*3) The ACK for the original data is received. The transmitted packet
> is acknowledged. The TCP timestamp is before the retrans_stamp,
> so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
> true without changing state to Open (because tcp_is_sack() is
> false), and tcp_process_loss() returns without calling
> tcp_try_undo_recovery(). Normally after undoing a CA_Loss
> episode, tcp_fastretrans_alert() would see that the connection
> has returned to CA_Open and fall through and call
> tcp_try_to_open(), which would set retrans_stamp to 0. However,
> for non-SACK connections we hold the connection in CA_Loss, so do
> not fall through to call tcp_try_to_open() and do not set
> retrans_stamp to 0. So retrans_stamp is (erroneously) still
> non-zero.
>
> At this point the first "retransmission event" has passed and
> been recovered from. Any future retransmission is a completely
> new "event". However, retrans_stamp is erroneously still
> set. (And we are still in CA_Loss, which is correct.)
>
> (*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
> packet is sent. Note: No data is transmitted between (*3) and
> (*4) and we disabled keep alives.
>
> The socket's timeout SHOULD be calculated from this point in
> time, but instead it's calculated from the prior "event" 16
> minutes ago (step (*2)).
>
> (*5) Because no ACK packet is received, the packet is retransmitted.
>
> (*6) At the time of the 2nd retransmission, the socket returns
> ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
> too far in the past (set at the time of (*2)).
>
> This commit fixes this bug by ensuring that we reuse in
> tcp_try_undo_loss() the same careful logic for non-SACK connections
> that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
> we factor out that logic into a new
> tcp_is_non_sack_preventing_reopen() helper and call that helper from
> both undo functions.
>
> Fixes: da34ac7626b5 ("tcp: only undo on partial ACKs in CA_Loss")
> Reported-by: Nagaraj Arankal <nagaraj.p.arankal@....com>
> Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/
> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
Reviewed-by: Eric Dumazet <edumazet@...gle.com>
Amazing that some folks still do not enable SACK...
Thanks !
Powered by blists - more mailing lists