Message-ID: <SJ0PR84MB1847BE6C24D274C46A1B9B0EB27A9@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM>
Date: Fri, 2 Sep 2022 10:29:09 +0000
From: "Arankal, Nagaraj" <nagaraj.p.arankal@....com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: retrans_stamp not cleared while testing NewReno implementation.

While testing the NewReno implementation (SACK disabled) on a
4.19.197-based Debian kernel, I found that connections with very low
traffic may time out too early if a second loss occurs after the first
one was successfully ACKed and no data was transferred in between.

The issue was described as follows:

When SACK is disabled, and a socket suffers multiple separate TCP
retransmissions, that socket's ETIMEDOUT value is calculated from the
time of the *first* retransmission instead of the *latest*
retransmission.

This happens because the tcp_sock's retrans_stamp is set once then
never cleared.

Take the following connection:

(*1) One data packet sent.

(*2) Because no ACK packet is received, the packet is retransmitted.

(*3) The ACK packet is received. The transmitted packet is
     acknowledged.

     At this point the first "retransmission event" has passed and
     been recovered from. Any future retransmission is a completely
     new "event".

(*4) After 16 minutes (to correspond with tcp_retries2=15), a new data
     packet is sent. Note: No data is transmitted between (*3) and
     (*4), and keepalives are disabled.

     The socket's timeout SHOULD be calculated from this point in
     time, but instead it's calculated from the prior "event" 16
     minutes ago.

(*5) Because no ACK packet is received, the packet is retransmitted.

(*6) At the time of the 2nd retransmission, the socket returns
     ETIMEDOUT.

From the history I learned that a fix was included which should
resolve the above issue. Please find the patch below.

static bool tcp_try_undo_recovery(struct sock *sk)
 		 * is ACKed. For Reno it is MUST to prevent false
 		 * fast retransmits (RFC2582). SACK TCP is safe.
 		 */
 		tcp_moderate_cwnd(tp);
+		if (!tcp_any_retrans_done(sk))
+			tp->retrans_stamp = 0;
 		return true;
 	}

However, after the following fix was introduced,

  [net,1/2] tcp: only undo on partial ACKs in CA_Loss

I no longer see retrans_stamp reset to zero. Inside tcp_process_loss,
we return from the code path below:

	if ((flag & FLAG_SND_UNA_ADVANCED) &&
	    tcp_try_undo_loss(sk, false))
		return;

As a result, tp->retrans_stamp is never cleared, because
tcp_try_undo_recovery is never invoked.

Is this a known bug in the kernel code, or is it expected behavior?

Thanks in advance,
Nagaraj
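The scenario in steps (*1) to (*6) can be sketched as a small user-space model. This is only an illustration, not kernel code: struct toy_sock, the toy_* functions, the timestamps, and the 15-minute budget standing in for tcp_retries2=15 are all hypothetical, and the real timeout check in the kernel is more involved. The sketch assumes that a retrans_stamp of 0 means "no retransmission episode in progress".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical, not the kernel's struct tcp_sock). */
struct toy_sock {
    uint32_t retrans_stamp; /* ms time of first retransmit in the episode, 0 = none */
    uint32_t budget_ms;     /* assumed overall retransmission budget */
};

/* A segment is retransmitted: the stamp is set only if it is not already set. */
void toy_retransmit(struct toy_sock *sk, uint32_t now_ms)
{
    if (sk->retrans_stamp == 0)
        sk->retrans_stamp = now_ms;
}

/* All outstanding data was ACKed. clear_stamp models whether the code
 * path that resets retrans_stamp is actually reached. */
void toy_recovered(struct toy_sock *sk, bool clear_stamp)
{
    if (clear_stamp)
        sk->retrans_stamp = 0;
}

/* Would the connection be declared timed out at now_ms? */
bool toy_timed_out(const struct toy_sock *sk, uint32_t now_ms)
{
    if (sk->retrans_stamp == 0)
        return false;
    return (now_ms - sk->retrans_stamp) >= sk->budget_ms;
}

/* Replays steps (*1)..(*6); returns true if the socket times out at (*6). */
bool toy_replay(bool clear_stamp)
{
    struct toy_sock sk = { .retrans_stamp = 0, .budget_ms = 15u * 60u * 1000u };
    uint32_t t2 = 1000;                    /* (*2) first retransmission */
    uint32_t t4 = t2 + 16u * 60u * 1000u;  /* (*4) new data, 16 minutes later */
    uint32_t t5 = t4 + 200;                /* (*5) second retransmission */

    toy_retransmit(&sk, t2);               /* (*2) */
    toy_recovered(&sk, clear_stamp);       /* (*3) */
    toy_retransmit(&sk, t5);               /* (*5) */
    return toy_timed_out(&sk, t5);         /* (*6) */
}
```

Calling toy_replay(false) models the stale-stamp case, where the elapsed time is measured from (*2) and the budget is already exceeded at the second retransmission. Calling toy_replay(true) models the case where the stamp is cleared at (*3), so the second episode starts its own clock.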