Message-Id: <20070712.225950.12335719.noboru.obata.ar@hitachi.com>
Date: Thu, 12 Jul 2007 22:59:50 +0900 (JST)
From: OBATA Noboru <noboru.obata.ar@...achi.com>
To: davem@...emloft.net
Cc: shemminger@...ux-foundation.org, yoshfuji@...ux-ipv6.org,
netdev@...r.kernel.org
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
From: David Miller <davem@...emloft.net>
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
Date: Thu, 12 Jul 2007 02:37:10 -0700 (PDT)
> From: OBATA Noboru <noboru.obata.ar@...achi.com>
> Date: Thu, 12 Jul 2007 16:15:10 +0900 (JST)
>
> > 1. Network device layer detects a failure first and switches to
> > a backup device (say, in 20sec).
> >
> > 2. TCP layer timeout & retransmission comes next, _hopefully_
> > before the application layer timeout.
> >
> > 3. Application layer detects a network failure last (by, say,
> > 30sec timeout) and may trigger a system-level failover.
> >
> > * Note 1. The timeouts for #1 and #2 are handled
> > independently and there is no relationship between them.
> >
> > * Note 2. The actual timeout settings (20sec or 30sec in
> > this example) are often determined by system requirements,
> > and so setting them to certain "safe values" (if any) is
> > usually not possible.
> >
> > If TCP retransmission misses the time frame between event #1
> > and #3 in Background above (between 20 and 30sec since network
> > failure), a failure causes a system-level failover where a
> > network-device-level failover should have been enough.
>
> I'm still totally unconvinced, this seems pointless.
>
> TCP's timeouts are perfectly fine, and the only thing you
> might be showing above is that the application timeouts
> are too short or that TCP needs notifications.
I take your comment seriously, David.

And I agree with you that TCP's timeouts are fine on a network
where congestion is the primary cause of packet loss.

But in a high-speed LAN today, for example, congestion is
largely eliminated by network capacity design, and physical
failure of devices and cables is now a major concern, which is
addressed by redundant devices and failover. TCP's timeouts
(RTT/RTO estimation and exponential backoff) work fine on
failover-capable networks as well, but I think a smaller
TCP_RTO_MAX is desirable because failover can take place on the
order of seconds. This would increase the usefulness of TCP on
such networks.
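
To make the timing argument concrete, here is a rough sketch in
Python (a simplified model, not kernel code). It assumes the first
retransmission timer fires one RTO after the failure, that the RTO
simply doubles on every timeout up to a cap, and that RTT
re-estimation plays no part. The function names and the 300 ms
initial RTO are hypothetical and chosen only for illustration; the
20-30sec window is the one from the example quoted above.

```python
def retransmit_times_ms(initial_rto_ms, rto_max_ms, horizon_ms):
    """Return the times (ms after the failure) at which retransmits fire.

    Simplified model: the RTO doubles after each timeout and is
    clamped to rto_max_ms. Times beyond horizon_ms are not listed.
    """
    times = []
    now = 0
    rto = initial_rto_ms
    while True:
        now += rto
        if now > horizon_ms:
            break
        times.append(now)
        rto = min(rto * 2, rto_max_ms)  # exponential backoff with a cap
    return times


def retransmits_in_window(initial_rto_ms, rto_max_ms, start_ms, end_ms):
    """Return the retransmit times that fall inside [start_ms, end_ms]."""
    return [t for t in retransmit_times_ms(initial_rto_ms, rto_max_ms, end_ms)
            if start_ms <= t <= end_ms]


if __name__ == "__main__":
    # Cap of 120 s: retransmits at 18.9 s and 38.1 s straddle the window.
    print(retransmits_in_window(300, 120_000, 20_000, 30_000))  # []
    # Hypothetical cap of 5 s: two retransmits land inside the window.
    print(retransmits_in_window(300, 5_000, 20_000, 30_000))    # [24300, 29300]
```

Under these assumptions, with a 120 s cap the sixth retransmit goes
out at 18.9 s and the seventh at 38.1 s, so nothing is sent between
20 and 30 s. With a 5 s cap, retransmits go out at 24.3 s and 29.3 s,
inside the window.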
How do you think TCP timeouts in Linux can adapt to such changes
in the network environment?
> I am totally unconvinced about your dom0 vs. domU notification
> arguments as well.
Well, I'd appreciate it if you could explain in a bit more detail
why my argument does not make sense to you.
In a virtualized environment, a failure is detected in Dom-0,
while the TCP stack that needs to be notified sits in Dom-U. I
think notifications from Dom-0 to the Dom-U TCP stack are not easy.
Best regards,
--
OBATA Noboru (noboru.obata.ar@...achi.com)