Message-ID: <1283859552.2338.402.camel@edumazet-laptop>
Date: Tue, 07 Sep 2010 13:39:12 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: leandroal@...il.com, netdev@...r.kernel.org,
Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Subject: Re: TCP packet size and delivery packet decisions
On Monday 06 September 2010 at 22:30 -0700, David Miller wrote:
> The small 78 byte window is why the sending system is splitting up the
> writes into smaller pieces.
>
> I presume that the system advertises exactly a 78 byte window because
> this is how large the commands are. But this is an extremely foolish
> and baroque thing to do, and it's why you are having problems.
I am not sure why TSO added a "Bound mss with half of window"
requirement to tcp_sync_mss().
I tried with MSS=1000 and WIN=1000, and the segment size chosen is 500.
With WIN=78, 78/2 -> 39 is below the minimum, so it is raised to 48
(68U - tp->tcp_header_len).
Is there a hard requirement that the segment size be at most half the
window?
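To make the arithmetic above concrete, here is a minimal user-space sketch of the two bounding rules. The helper names (umax, bound_to_half_wnd, bound_to_wnd) and the flat integer parameters are illustrative assumptions, not the kernel API, and a 20-byte TCP header (no options) is assumed, which is what gives the 48-byte floor:

```c
/* Illustrative user-space model only; not the kernel implementation. */

/* Hypothetical helper standing in for the kernel's max() macro. */
static unsigned int umax(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Half-window rule: if pktsize exceeds half of the largest window seen,
 * use half the window, but never less than 68 - header_len.
 */
static unsigned int bound_to_half_wnd(unsigned int max_window,
				      unsigned int header_len,
				      unsigned int pktsize)
{
	if (max_window && pktsize > (max_window >> 1))
		return umax(max_window >> 1, 68U - header_len);
	return pktsize;
}

/*
 * Full-window rule: only shrink pktsize when it exceeds the whole window,
 * with the same 68 - header_len floor.
 */
static unsigned int bound_to_wnd(unsigned int max_window,
				 unsigned int header_len,
				 unsigned int pktsize)
{
	if (max_window && pktsize > max_window)
		return umax(max_window, 68U - header_len);
	return pktsize;
}
```

With header_len = 20, bound_to_half_wnd(1000, 20, 1000) yields 500 and bound_to_half_wnd(78, 20, 78) yields 48, matching the numbers observed above, while bound_to_wnd(78, 20, 78) leaves the size at 78.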
The following patch solves the problem for me:
[PATCH] tcp: bound mss to window in tcp_sync_mss()
Leandro Melo de Sales noticed that if a peer announces a very small
initial tcp window (78 in his case), the first frames sent have
unnecessarily small lengths (48 in his case):
CLNT->SRV [SYN] Seq=0 Win=5840 Len=0 MSS=1460
SRV->CLNT [SYN, ACK] Seq=0 Ack=1 Win=78 Len=0 MSS=78
CLNT->SRV [ACK] Seq=1 Ack=1 Win=5840 Len=0
CLNT->SRV [PSH, ACK] Seq=1 Ack=1 Win=5840 Len=48
CLNT->SRV [PSH, ACK] Seq=49 Ack=1 Win=5840 Len=30
SRV->CLNT [ACK] Seq=1 Ack=49 Win=78 Len=0
SRV->CLNT [RST, ACK] Seq=1 Ack=79 Win=78 Len=0
tcp_sync_mss() bounds the mss to half the window, while it could use
the full window:
CLNT->SRV [SYN] Seq=0 Win=5840 Len=0 MSS=1460
SRV->CLNT [SYN, ACK] Seq=0 Ack=1 Win=78 Len=0 MSS=78
CLNT->SRV [ACK] Seq=1 Ack=1 Win=5840 Len=0
CLNT->SRV [PSH, ACK] Seq=1 Ack=1 Win=5840 Len=78
SRV->CLNT [ACK] Seq=1 Ack=79 Win=78 Len=0
Reported-by: ツ Leandro Melo de Sales <leandroal@...il.com>
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
CC: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
---
include/net/tcp.h | 9 +++++++++
net/ipv4/tcp_output.c | 2 +-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index eaa9582..c262676 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -481,6 +481,15 @@ static inline int tcp_bound_to_half_wnd(struct tcp_sock *tp, int pktsize)
return pktsize;
}
+/* Bound MSS / TSO packet size with the window */
+static inline int tcp_bound_to_wnd(struct tcp_sock *tp, int pktsize)
+{
+	if (tp->max_window && pktsize > tp->max_window)
+		return max(tp->max_window, 68U - tp->tcp_header_len);
+	else
+		return pktsize;
+}
+
/* tcp.c */
extern void tcp_get_info(struct sock *, struct tcp_info *);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index de3bd84..49cdbe4 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1224,7 +1224,7 @@ unsigned int tcp_sync_mss(struct sock *sk, u32 pmtu)
icsk->icsk_mtup.search_high = pmtu;
mss_now = tcp_mtu_to_mss(sk, pmtu);
-	mss_now = tcp_bound_to_half_wnd(tp, mss_now);
+	mss_now = tcp_bound_to_wnd(tp, mss_now);
/* And store cached results */
icsk->icsk_pmtu_cookie = pmtu;