[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <000201d5c140$f5dd0cb0$e1972610$@wangsu.com>
Date: Thu, 2 Jan 2020 15:48:01 +0800
From: 杨鹏程 <yangpc@...gsu.com>
To: "'Eric Dumazet'" <edumazet@...gle.com>
Cc: "'David Miller'" <davem@...emloft.net>,
"'Alexey Kuznetsov'" <kuznet@....inr.ac.ru>,
"'Hideaki YOSHIFUJI'" <yoshfuji@...ux-ipv6.org>,
"'Alexei Starovoitov'" <ast@...nel.org>,
"'Daniel Borkmann'" <daniel@...earbox.net>,
"'Martin KaFai Lau'" <kafai@...com>,
"'Song Liu'" <songliubraving@...com>,
"'Yonghong Song'" <yhs@...com>, <andriin@...com>,
"'netdev'" <netdev@...r.kernel.org>,
"'LKML'" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK
Hi Eric Dumazet,
I'm sorry there was a slight error in the packetdrill test case of the previous email reply,
the ACK segment should not carry data, although this does not affect the description of the situation.
I fixed the packetdrill test and resent it as follows:
packetdrill test case:
// Verify the "old stuff" D-SACK causing SACK to be treated as D-SACK
--tolerance_usecs=10000
// enable RACK and TLP
0 `sysctl -q net.ipv4.tcp_recovery=1; sysctl -q net.ipv4.tcp_early_retrans=3`
// Establish a connection, rtt = 10ms
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.01 < . 1:1(0) ack 1 win 320
+0 accept(3, ..., ...) = 4
// send 10 data segments
+0 write(4, ..., 10000) = 10000
+0 > P. 1:10001(10000) ack 1
// send TLP
+.02 > P. 9001:10001(1000) ack 1
// enter recovery and retransmit 1:1001, now undo_marker = 1
+.015 < . 1:1(0) ack 1 win 320 <sack 9001:10001, nop, nop>
+0 > . 1:1001(1000) ack 1
// ack 1:1001 and retransmit 1001:3001
+.01 < . 1:1(0) ack 1001 win 320 <sack 9001:10001, nop, nop>
+0 > . 1001:3001(2000) ack 1
// sack 2001:3001, now 2001:3001 has R|S
+.01 < . 1:1(0) ack 1001 win 320 <sack 2001:3001 9001:10001, nop, nop>
+0 %{ assert tcpi_reordering == 3, tcpi_reordering }%
// d-sack 1:1001, satisfies: undo_marker(1) <= start_seq < end_seq <= prior_snd_una(1001)
// BUG: 2001:3001 is treated as D-SACK then reordering is modified in tcp_sacktag_one()
+0 < . 1:1(0) ack 1001 win 320 <sack 1:1001 2001:3001 9001:10001, nop, nop>
// reordering was modified to 8
+0 %{ assert tcpi_reordering == 3, tcpi_reordering }%
-----邮件原件-----
发件人: 杨鹏程 <yangpc@...gsu.com>
发送时间: 2020年1月1日 19:47
收件人: 'Eric Dumazet' <edumazet@...gle.com>
抄送: 'David Miller' <davem@...emloft.net>; 'Alexey Kuznetsov' <kuznet@....inr.ac.ru>; 'Hideaki YOSHIFUJI' <yoshfuji@...ux-ipv6.org>; 'Alexei Starovoitov' <ast@...nel.org>; 'Daniel Borkmann' <daniel@...earbox.net>; 'Martin KaFai Lau' <kafai@...com>; 'Song Liu' <songliubraving@...com>; 'Yonghong Song' <yhs@...com>; 'andriin@...com' <andriin@...com>; 'netdev' <netdev@...r.kernel.org>; 'LKML' <linux-kernel@...r.kernel.org>
主题: Re: [PATCH] tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK
Hi Eric Dumazet,
Thanks for discussing this issue.
'previous sack segment was lost' means that the SACK segment carried by D-SACK will be processed by tcp_sacktag_one () due to the previous SACK loss, but this is not necessary.
Here is the packetdrill test, this example shows that the reordering was modified because the SACK segment was treated as D-SACK.
//dsack-old-stuff-bug.pkt
// Verify the "old stuff" D-SACK causing SACK to be treated as D-SACK
--tolerance_usecs=10000
// enable RACK and TLP
0 `sysctl -q net.ipv4.tcp_recovery=1; sysctl -q net.ipv4.tcp_early_retrans=3`
// Establish a connection, rtt = 10ms
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.01 < . 1:1(0) ack 1 win 320
+0 accept(3, ..., ...) = 4
// send 10 data segments
+0 write(4, ..., 10000) = 10000
+0 > P. 1:10001(10000) ack 1
// send TLP
+.02 > P. 9001:10001(1000) ack 1
// enter recovery and retransmit 1:1001, now undo_marker = 1
+.015 < . 1:1(0) ack 1 win 320 <sack 9001:10001, nop, nop>
+0 > . 1:1001(1000) ack 1
// ack 1:1001 and retransmit 1001:3001
+.01 < . 1:1001(1000) ack 1001 win 320 <sack 9001:10001, nop, nop>
+0 > . 1001:3001(2000) ack 1
// sack 2001:3001, now 2001:3001 has R|S
+.01 < . 1001:1001(0) ack 1001 win 320 <sack 2001:3001 9001:10001, nop, nop>
+0 %{ assert tcpi_reordering == 3, tcpi_reordering }%
// d-sack 1:1001, satisfies: undo_marker(1) <= start_seq < end_seq <= prior_snd_una(1001) // BUG: 2001:3001 is treated as D-SACK then reordering is modified in tcp_sacktag_one()
+0 < . 1001:1001(0) ack 1001 win 320 <sack 1:1001 2001:3001 9001:10001, nop, nop>
// reordering was modified to 8
+0 %{ assert tcpi_reordering == 3, tcpi_reordering }%
-----邮件原件-----
发件人: Eric Dumazet <edumazet@...gle.com>
发送时间: 2019年12月30日 21:41
收件人: Pengcheng Yang <yangpc@...gsu.com>
抄送: David Miller <davem@...emloft.net>; Alexey Kuznetsov <kuznet@....inr.ac.ru>; Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>; Alexei Starovoitov <ast@...nel.org>; Daniel Borkmann <daniel@...earbox.net>; Martin KaFai Lau <kafai@...com>; Song Liu <songliubraving@...com>; Yonghong Song <yhs@...com>; andriin@...com; netdev <netdev@...r.kernel.org>; LKML <linux-kernel@...r.kernel.org>
主题: Re: [PATCH] tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK
On Mon, Dec 30, 2019 at 1:55 AM Pengcheng Yang <yangpc@...gsu.com> wrote:
>
> When we receive a D-SACK, where the sequence number satisfies:
> undo_marker <= start_seq < end_seq <= prior_snd_una we
> consider this is a valid D-SACK and tcp_is_sackblock_valid() returns
> true, then this D-SACK is discarded as "old stuff", but the variable
> first_sack_index is not marked as negative in
> tcp_sacktag_write_queue().
>
> If this D-SACK also carries a SACK that needs to be processed (for
> example, the previous SACK segment was lost),
What do you mean by ' previous sack segment was lost' ?
this SACK
> will be treated as a D-SACK in the following processing of
> tcp_sacktag_write_queue(), which will eventually lead to incorrect
> updates of undo_retrans and reordering.
>
> Fixes: fd6dad616d4f ("[TCP]: Earlier SACK block verification &
> simplify access to them")
> Signed-off-by: Pengcheng Yang <yangpc@...gsu.com>
> ---
> net/ipv4/tcp_input.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index
> 88b987c..0238b55 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -1727,8 +1727,11 @@ static int tcp_sack_cache_ok(const struct tcp_sock *tp, const struct tcp_sack_bl
> }
>
> /* Ignore very old stuff early */
> - if (!after(sp[used_sacks].end_seq, prior_snd_una))
> + if (!after(sp[used_sacks].end_seq, prior_snd_una)) {
> + if (i == 0)
> + first_sack_index = -1;
> continue;
> + }
>
> used_sacks++;
> }
Hi Pengcheng Yang
This corner case deserves a packetdrill test so that we understand the issue, can you provide one ?
Thanks.
Powered by blists - more mailing lists