[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1301773495.2837.26.camel@edumazet-laptop>
Date: Sat, 02 Apr 2011 21:44:55 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Willy Tarreau <w@....eu>
Cc: Cyril Bonté <cyril.bonte@...e.fr>,
netdev@...r.kernel.org, Daniel Baluta <daniel.baluta@...il.com>,
Gaspar Chilingarov <gasparch@...il.com>,
Charles Duffy <charles@...is.net>
Subject: Re: tcp: disallow bind() to reuse addr/port regression in 2.6.38
Le samedi 02 avril 2011 à 21:15 +0200, Willy Tarreau a écrit :
> Hi Eric,
>
> On Sat, Apr 02, 2011 at 08:46:11PM +0200, Cyril Bonté wrote:
> > Le samedi 2 avril 2011 20:10:48, Eric Dumazet a écrit :
> > > Le samedi 02 avril 2011 à 20:01 +0200, Cyril Bonté a écrit :
> > > (...)
> > > > > if (shutdown(listenfd, SHUT_WR) == 0 &&
> > > >
> > > > listen(listenfd, 1024) == 0 &&
> > > > shutdown(listenfd, SHUT_RD) == 0) {
> > > >
> > > > printf("shutdown OK\n");
> > > >
> > > > }
> > > >
> > > > }
> > > > exit(0);
> > > >
> > > > }
> > >
> > > Wow, not clear what this is doing....
> > >
> > > for sure the listen() call is not needed ?
> > >
> > > And the shutdown(listenfd, SHUT_WR) is clearly useless too.
> >
> > Well, I'm not the best one to explain that part but from what i read in the
> > comments of this part of code, both listen and SHUT_WR are used to detect
> > errors on various OS (OpenBSD, Solaris, ...).
> >
> > > I feel you only needed the shutdown(listenfd, SHUT_RD) call.
> > >
> > > Why haproxy needs to setup a second listening socket on same port ?
> >
> > I simplified the test case, which is far from what haproxy do (just forgot to
> > explain the real behaviour).
> > To reload the configuration, a new haproxy process is launched, sending a
> > signal to the previous one and asking it to free the ports for a while (the
> > shutdown part in the test). The new process then tries to bind the ports,
> > which worked until 2.6.38 (if an error occurs, a new signal is sent to the
> > previous process to listen to its sockets again).
>
> Indeed, here's what normally happens when haproxy reloads.
>
> New process is loaded with a new config. Once the config correctly parses,
> it sends a signal to the previous process asking it to temporarily release
> its listening ports so that the new one can bind, hence the shutdown(SHUT_RD)
> performed in the old process.
>
> Then the new process can grab the ports and listen to them. Once that's OK,
> it sends another signal to the old process telling it it can go away. But
> if the new process failed to completely start (eg: could not grab one port),
> then it sends a third signal to the old process asking it to rebind the port
> and serve them again, and the new one dies with an error.
>
> That way, the service is never interrupted even if the new config fails
> late, because the old process has the ability to rebind to the port it
> temporarily released.
>
> Now with 2.6.38, as Cyril diagnosed it, the new bind() fails when the
> old process has just performed its shutdown(SHUT_RD), preventing the
> new process from binding to the ports until the old process has
> definitely closed them.
>
> The behaviour is very useful, because the old process might have lost
> its privileges, it will not have to rebind to the socket, just listen
> on it again since it is never closed.
>
> This is quite embarrassing, because this code used to work for the
> last 10 years, at least since kernel 2.2, and maybe even 2.0, I don't
> remember.
>
> I'm not sure what the original intent of the patch was, not what was
> the reported issue, but maybe we could find a way to both fix the
> reported issue (if any) and restore the old behaviour in order not
> to break existing programs.
>
> Best regards,
> Willy
>
I wish it was that simple....
http://www.spinics.net/lists/netdev/msg151551.html
Is Cyril program running OK on FreeBsd ?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists