Message-Id: <20070712.155614.91275946.noboru.obata.ar@hitachi.com>
Date:	Thu, 12 Jul 2007 15:56:14 +0900 (JST)
From:	OBATA Noboru <noboru.obata.ar@...achi.com>
To:	ian.mcdonald@...di.co.nz
Cc:	davem@...emloft.net, shemminger@...ux-foundation.org,
	yoshfuji@...ux-ipv6.org, netdev@...r.kernel.org
Subject: Re: [PATCH 2.6.22-rc5] TCP: Make TCP_RTO_MAX a variable

From: "Ian McDonald" <ian.mcdonald@...di.co.nz>
Subject: [MaybeSpam] Re: [PATCH 2.6.22-rc5] TCP: Make TCP_RTO_MAX a variable
Date: Tue, 26 Jun 2007 10:18:46 +1200

> On 6/26/07, OBATA Noboru <noboru.obata.ar@...achi.com> wrote:
> > From: OBATA Noboru <noboru.obata.ar@...achi.com>
> >
> > Make TCP_RTO_MAX a variable, and allow a user to change it via a
> > new sysctl entry /proc/sys/net/ipv4/tcp_rto_max.  A user can
> > then guarantee TCP retransmission to be more controllable, say,
> > at least once per 10 seconds, by setting it to 10.  This is
> > quite helpful on failover-capable network devices, such as an
> > active-backup bonding device.  On such devices, it is desirable
> > that TCP retransmits a packet shortly after the failover, which
> > is what I would like to do with this patch.  Please see
> > Background and Problem below for rationale in detail.
> >
> RFC2988 says this:
>    (2.4) Whenever RTO is computed, if it is less than 1 second then the
>          RTO SHOULD be rounded up to 1 second.
> 
>          Traditionally, TCP implementations use coarse grain clocks to
>          measure the RTT and trigger the RTO, which imposes a large
>          minimum value on the RTO.  Research suggests that a large
>          minimum RTO is needed to keep TCP conservative and avoid
>          spurious retransmissions [AP99].  Therefore, this
>          specification requires a large minimum RTO as a conservative
>          approach, while at the same time acknowledging that at some
>          future point, research may show that a smaller minimum RTO is
>          acceptable or superior.
> 
>    (2.5) A maximum value MAY be placed on RTO provided it is at least 60
>          seconds.
> 
> Your code doesn't seem to meet requirements of section 2.5 as your
> minimum is 1 second.
> 
> I think if you're trying to solve the bonding issue then you should
> solve that issue, not hack the TCP implementation as that opens it up
> to abuse in other ways.

I think this is rather a new problem, or requirement, that arises
in the combined case of "TCP on a failover-capable network
device," and it cannot easily be solved by bonding alone.

A notification mechanism from bonding to TCP has been suggested,
but I think it is really hard to do in a virtualized environment
like Xen.  The hypervisor (Dom-0) takes care of physical devices,
including bonding, and guests (Dom-U) handle TCP.  Notifying TCP
in Dom-U from bonding in Dom-0 is really a challenge.

My problem (TCP retransmission may not be done in the expected
time frame, e.g., within 10 seconds after a bonding failover)
still occurs in such an environment, and my code (capping
TCP_RTO_MAX) still works in a VM environment.
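
To illustrate the timing argument, here is a minimal sketch in
userspace C, not kernel code.  It assumes the retransmission
timer simply doubles on every timeout and is clamped to the
maximum, and that the outage begins with the first lost packet.
The helper names, the 1-second starting RTO and the 40-second
outage are hypothetical values chosen for the example; 120
seconds is the stock TCP_RTO_MAX.

```c
#include <assert.h>

/* One backoff step: double the RTO, clamped to the configured maximum. */
static unsigned long next_rto(unsigned long rto_ms, unsigned long rto_max_ms)
{
	unsigned long doubled = rto_ms * 2;

	return doubled < rto_max_ms ? doubled : rto_max_ms;
}

/*
 * Time (ms since the first loss) of the first retransmission that
 * fires at or after the link comes back at link_up_ms.
 * Hypothetical helper for illustration only.
 */
static unsigned long first_retransmit_after(unsigned long start_rto_ms,
					    unsigned long rto_max_ms,
					    unsigned long link_up_ms)
{
	unsigned long t = 0;
	unsigned long rto;

	assert(start_rto_ms > 0 && rto_max_ms > 0);
	rto = start_rto_ms < rto_max_ms ? start_rto_ms : rto_max_ms;

	for (;;) {
		t += rto;			/* timer fires, packet is resent */
		if (t >= link_up_ms)
			return t;		/* first attempt that can succeed */
		rto = next_rto(rto, rto_max_ms);
	}
}
```

Under these assumptions, first_retransmit_after(1000, 120000,
40000) gives 63000, so the connection stays idle for 23 seconds
after the failover completes.  With the cap set to 10 seconds,
first_retransmit_after(1000, 10000, 40000) gives 45000, a wait of
5 seconds, and no wait can ever exceed the cap itself.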

So solving this in TCP layer makes sense to me.

Regards,

-- 
OBATA Noboru (noboru.obata.ar@...achi.com)