Message-ID: <1314250134.6797.24.camel@edumazet-laptop>
Date: Thu, 25 Aug 2011 07:28:54 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Yuchung Cheng <ycheng@...gle.com>
Cc: Hagen Paul Pfeifer <hagen@...u.net>, netdev@...r.kernel.org
Subject: Re: [PATCH] tcp: bound RTO to minimum
On Wednesday, August 24, 2011, at 18:50 -0700, Yuchung Cheng wrote:
> On Wed, Aug 24, 2011 at 4:41 PM, Hagen Paul Pfeifer <hagen@...u.net> wrote:
> > Check if the calculated RTO is less than TCP_RTO_MIN. If so, we
> > adjust the value to TCP_RTO_MIN.
> >
> but tp->rttvar is already lower-bounded via tcp_rto_min()?
>
> static inline void tcp_set_rto(struct sock *sk)
> {
> ...
>
> /* NOTE: clamping at TCP_RTO_MIN is not required, current algo
> * guarantees that rto is higher.
> */
> tcp_bound_rto(sk);
> }
Yes, and furthermore, we also rate-limit ICMP, so in my tests I reach
icsk_rto > 1 sec within a few rounds:
07:16:13.010633 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 3833540215:3833540263(48) ack 2593537670 win 305
07:16:13.221111 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:13.661151 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:14.541153 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:16.301152 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
<from this point, icsk_rto=1.76sec >
07:16:18.061158 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:19.821158 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:21.581018 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:23.341156 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:25.101151 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:26.861155 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:28.621158 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:30.381152 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
07:16:32.141157 IP 10.2.1.2.59352 > 10.2.1.1.ssh: P 0:48(48) ack 1 win 305
The real question is: do we really want to process ~1000 timer
interrupts per TCP session, ~2000 skb alloc/free/build/handling
operations, and possibly ~1000 ARP requests, only to make TCP recover in
~1 sec when connectivity returns? This just doesn't scale.
On a server handling ~1,000,000 (long-lived) sessions, using
application-side keepalives (say one message sent every minute on each
session), a temporary connectivity disruption _could_ make it enter a
critical zone, burning CPU and memory.
It seems TCP-LCD (RFC 6069) depends very much on ICMP being rate-limited.
I'll have to check what happens with multiple sessions: we might have
CPUs fighting over a single inetpeer and throttling, thus allowing
backoff to increase after all.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html