netdev - [PATCH V2 net-next] TLP: Don't reschedule PTO when there's one outstanding TLP retransmission

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Date:   Wed, 26 Jul 2017 17:44:45 +0800
From:   Mao Wenan <maowenan@...wei.com>
To:     <netdev@...r.kernel.org>, <davem@...emloft.net>,
        <ncardwell@...gle.com>, <ycheng@...gle.com>, <nanditad@...gle.com>,
        <weiyongjun1@...wei.com>, <chenweilong@...wei.com>,
        <wangkefeng.wang@...wei.com>
Subject: [PATCH V2 net-next] TLP: Don't reschedule PTO when there's one outstanding TLP retransmission

If there is one TLP probe went out(TLP use the write_queue_tail packet
as TLP probe, we assume this first TLP probe named A), and this TLP
probe was not acked by receive side.

Then the transmit side sent the next two packetes out(named B,C), but
unfortunately these two packets are also not acked by receive side.

And then there is one data packet with ack_seq A arrive
at transmit side, in tcp_ack() will call tcp_schedule_loss_probe()
to rearm PTO, the handler tcp_send_loss_probe() is to check
if(tp->tlp_high_seq) then go to rearm_timer(because there is
one outstanding TLP named A),
so the new TLP probe can't be sent out and it needs to rearm the RTO
timer(timeout is relative to the transmit time of the write queue head).

After that, there is another data packet with ack_seq A is received,
if the tlp_time_stamp is greater than rto_time_stamp, it will reset the
TLP timeout, which is before previous RTO timeout, so PTO is rearm and previous
RTO is cleared. Because there is no retransmission packet was sent or
no TLP sack receive, tp->tlp_high_seq can't be reset to zero and the next
TLP probe also can't be sent out, so there is no way(or very long time)
to retransmit the lost packet.

This fix is to check(tp->tlp_high_seq) in tcp_schedule_loss_probe()
when TLP PTO is after RTO, It is not needed to reschedule PTO when there
is one outstanding TLP retransmission, so if the TLP A is lost RTO can
retransmit lost packet, then tp->tlp_high_seq will be set to 0, and TLP
will go to the normal work process.

v1->v2
	refine some words of code and patch comments.

Signed-off-by: Mao Wenan <maowenan@...wei.com>
---
 net/ipv4/tcp_output.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 886d874..f85c7ef 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2423,6 +2423,12 @@ bool tcp_schedule_loss_probe(struct sock *sk)
 	tlp_time_stamp = tcp_jiffies32 + timeout;
 	rto_time_stamp = (u32)inet_csk(sk)->icsk_timeout;
 	if ((s32)(tlp_time_stamp - rto_time_stamp) > 0) {
+		/* It is not needed to reschedule PTO when there 
+		 * is one outstanding TLP retransmission. 
+		 */
+		if (tp->tlp_high_seq) {
+			return false;
+		}
 		s32 delta = rto_time_stamp - tcp_jiffies32;
 		if (delta > 0)
 			timeout = delta;
-- 
2.5.0