Message-ID: <1349161479.22107.17.camel@cr0>
Date: Tue, 02 Oct 2012 15:04:39 +0800
From: Cong Wang <amwang@...hat.com>
To: Neil Horman <nhorman@...driver.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Patrick McHardy <kaber@...sh.net>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [RFC PATCH net-next] tcp: introduce tcp_tw_interval to specify
the time of TIME-WAIT
On Fri, 2012-09-28 at 09:16 -0400, Neil Horman wrote:
> On Fri, Sep 28, 2012 at 02:33:07PM +0800, Cong Wang wrote:
> > On Thu, 2012-09-27 at 10:23 -0400, Neil Horman wrote:
> > > On Thu, Sep 27, 2012 at 04:41:01PM +0800, Cong Wang wrote:
> > > > A customer has requested this feature; as they stated:
> > > >
> > > > "This parameter is necessary, especially for software that continually
> > > > creates many ephemeral processes which open sockets, to avoid socket
> > > > exhaustion. In many cases, the risk of the exhaustion can be reduced by
> > > > tuning reuse interval to allow sockets to be reusable earlier.
> > > >
> > > > In commercial Unix systems, parameters of this kind, such as
> > > > tcp_timewait in AIX and tcp_time_wait_interval in HP-UX, have
> > > > long been available. Their implementations allow users to tune
> > > > how long a TCP connection is kept in the TIME-WAIT state, on a
> > > > millisecond time scale."
> > > >
> > > > We do have "tcp_tw_reuse" and "tcp_tw_recycle", but these tunings
> > > > are not equivalent: they cannot be tuned directly on a time scale,
> > > > nor are they safe, as some combinations of them can still cause
> > > > problems behind NAT. I think a one-second granularity is enough; we
> > > > don't need a millisecond time scale.
> > > >
> > > I have a little difficulty seeing how this does anything other than
> > > pay lip service to actually having sockets spend time in the TIME_WAIT
> > > state; I can only see users using it to make the pain stop. If we wait
> > > less time than it takes to be sure a connection isn't being reused (either
> > > by waiting two segment lifetimes, or by checking timestamps), then we might
> > > as well not wait at all. I see how it's tempting to be able to say "just
> > > don't wait as long", but there seems to be no difference between waiting
> > > half as long as the RFC mandates and waiting no time at all. Neither is a
> > > good idea.
> >
> > I don't think reducing TIME_WAIT is a good idea either, but there must
> > be some reason behind it, as several UNIX systems provide a
> > millisecond-scale tuning interface; or maybe in non-recycle mode their
> > RTO is much less than 2*MSL?
> >
> My guess? Cash was the reason. I certainly wasn't there for any of those
> developments, but a setting like this just smells to me like some customer waved
> some cash under IBM's/HP's/Sun's nose and said, "We'd like to get our tcp
> sockets back to CLOSED state faster, what can you do for us?"
Yeah, maybe. But is it still unreasonable even if they are sure their
packets cannot possibly linger in their high-speed LAN for 2*MSL?
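
(As an aside, for anyone who wants to compare the knobs being discussed
here side by side, below is a minimal userspace sketch that just dumps
them from /proc. The tcp_tw_interval path is purely an assumption based
on this RFC's subject; it does not exist unless a patch like this one is
applied.)

/*
 * Illustrative only: print the TIME-WAIT related sysctls mentioned in
 * this thread.  tcp_tw_interval is hypothetical and will simply show
 * up as "<not present>" on an unpatched kernel.
 */
#include <stdio.h>

static void show(const char *path)
{
	char buf[64];
	FILE *f = fopen(path, "r");

	if (!f) {
		printf("%-42s <not present>\n", path);
		return;
	}
	if (fgets(buf, sizeof(buf), f))
		printf("%-42s %s", path, buf);
	fclose(f);
}

int main(void)
{
	show("/proc/sys/net/ipv4/tcp_tw_reuse");
	show("/proc/sys/net/ipv4/tcp_tw_recycle");
	show("/proc/sys/net/ipv4/tcp_max_tw_buckets");
	/* Hypothetical knob proposed by this RFC patch. */
	show("/proc/sys/net/ipv4/tcp_tw_interval");
	return 0;
}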
>
> > >
> > > Given the problem you're trying to solve here, I'll ask the standard question in
> > > response: How does using SO_REUSEADDR not solve the problem? Alternatively, in
> > > a pinch, why not reduce tcp_max_tw_buckets sufficiently to start forcing
> > > TIME_WAIT sockets back into CLOSED state?
> > >
> > > The code looks fine, but the idea really doesn't seem like a good plan to me.
> > > I'm sure HP-UX/Solaris/AIX/etc. have done this in response to customer demand, but
> > > that doesn't make it the right solution.
> > >
> >
> > *I think* the customer doesn't want to modify their applications;
> > that is why they don't use SO_REUSEADDR.
> >
> Well, OK, that's a legitimate distro problem. What it's not is an upstream
> problem. Fixing the application is the right thing to do, whether or not they
> want to.
>
> > I didn't know tcp_max_tw_buckets could do the trick, and neither did the
> > customer. So this is a side effect of tcp_max_tw_buckets? Is it documented?
> man 7 tcp:
> tcp_max_tw_buckets (integer; default: see below; since Linux 2.4)
> The maximum number of sockets in TIME_WAIT state allowed in the
> system. This limit exists only to prevent simple
> denial-of-service attacks. The default value of NR_FILE*2 is
> adjusted depending on the memory in the system. If this number
> is exceeded, the socket is closed and a warning is printed.
>
Hey, "a warning is printed" seems not very friendly. ;)
Thanks!