netdev - Re: [PATCH net] tcp: fix early ETIMEDOUT after spurious non-SACK RTO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aa581b7f-8dbc-5b62-dfc9-a21f439ab80c@gmail.com>
Date:   Sat, 3 Sep 2022 10:03:56 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Neal Cardwell <ncardwell.kernel@...il.com>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     netdev@...r.kernel.org, Neal Cardwell <ncardwell@...gle.com>,
        Nagaraj Arankal <nagaraj.p.arankal@....com>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [PATCH net] tcp: fix early ETIMEDOUT after spurious non-SACK RTO


On 9/3/22 05:10, Neal Cardwell wrote:
> From: Neal Cardwell <ncardwell@...gle.com>
>
> Fix a bug reported and analyzed by Nagaraj Arankal, where the handling
> of a spurious non-SACK RTO could cause a connection to fail to clear
> retrans_stamp, causing a later RTO to very prematurely time out the
> connection with ETIMEDOUT.
>
> Here is the buggy scenario, expanding upon Nagaraj Arankal's excellent
> report:
>
> (*1) Send one data packet on a non-SACK connection
>
> (*2) Because no ACK packet is received, the packet is retransmitted
>       and we enter CA_Loss; but this retransmission is spurious.
>
> (*3) The ACK for the original data is received. The transmitted packet
>       is acknowledged.  The TCP timestamp is before the retrans_stamp,
>       so tcp_may_undo() returns true, and tcp_try_undo_loss() returns
>       true without changing state to Open (because tcp_is_sack() is
>       false), and tcp_process_loss() returns without calling
>       tcp_try_undo_recovery().  Normally after undoing a CA_Loss
>       episode, tcp_fastretrans_alert() would see that the connection
>       has returned to CA_Open and fall through and call
>       tcp_try_to_open(), which would set retrans_stamp to 0.  However,
>       for non-SACK connections we hold the connection in CA_Loss, so do
>       not fall through to call tcp_try_to_open() and do not set
>       retrans_stamp to 0. So retrans_stamp is (erroneously) still
>       non-zero.
>
>       At this point the first "retransmission event" has passed and
>       been recovered from. Any future retransmission is a completely
>       new "event". However, retrans_stamp is erroneously still
>       set. (And we are still in CA_Loss, which is correct.)
>
> (*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
>       packet is sent. Note: No data is transmitted between (*3) and
>       (*4) and we disabled keep alives.
>
>       The socket's timeout SHOULD be calculated from this point in
>       time, but instead it's calculated from the prior "event" 16
>       minutes ago (step (*2)).
>
> (*5) Because no ACK packet is received, the packet is retransmitted.
>
> (*6) At the time of the 2nd retransmission, the socket returns
>       ETIMEDOUT, prematurely, because retrans_stamp is (erroneously)
>       too far in the past (set at the time of (*2)).
>
> This commit fixes this bug by ensuring that we reuse in
> tcp_try_undo_loss() the same careful logic for non-SACK connections
> that we have in tcp_try_undo_recovery(). To avoid duplicating logic,
> we factor out that logic into a new
> tcp_is_non_sack_preventing_reopen() helper and call that helper from
> both undo functions.
>
> Fixes: da34ac7626b5 ("tcp: only undo on partial ACKs in CA_Loss")
> Reported-by: Nagaraj Arankal <nagaraj.p.arankal@....com>
> Link: https://lore.kernel.org/all/SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM/
> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Yuchung Cheng <ycheng@...gle.com>


Reviewed-by: Eric Dumazet <edumazet@...gle.com>

Amazing that some folks still do not enable SACK...

Thanks !