lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 2 Apr 2011 21:15:16 +0200
From:	Willy Tarreau <w@....eu>
To:	Cyril Bonté <cyril.bonte@...e.fr>
Cc:	Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
	Daniel Baluta <daniel.baluta@...il.com>,
	Gaspar Chilingarov <gasparch@...il.com>,
	Charles Duffy <charles@...is.net>
Subject: Re: tcp: disallow bind() to reuse addr/port regression in 2.6.38

Hi Eric,

On Sat, Apr 02, 2011 at 08:46:11PM +0200, Cyril Bonté wrote:
> Le samedi 2 avril 2011 20:10:48, Eric Dumazet a écrit :
> > Le samedi 02 avril 2011 à 20:01 +0200, Cyril Bonté a écrit :
> > (...)
> > > > 		if (shutdown(listenfd, SHUT_WR) == 0 &&
> > > 		
> > > 		    listen(listenfd, 1024) == 0 &&
> > > 		    shutdown(listenfd, SHUT_RD) == 0) {
> > > 			
> > > 			printf("shutdown OK\n");
> > > 		
> > > 		}
> > > 	
> > > 	}
> > > 	exit(0);
> > > 
> > > }
> > 
> > Wow, not clear what this is doing....
> > 
> > for sure the listen() call is not needed ?
> > 
> > And the shutdown(listenfd, SHUT_WR) is clearly useless too.
> 
> Well, I'm not the best one to explain that part but from what i read in the 
> comments of this part of code, both listen and SHUT_WR are used to detect 
> errors on various OS (OpenBSD, Solaris, ...).
> 
> > I feel you only needed the shutdown(listenfd, SHUT_RD) call.
> > 
> > Why haproxy needs to setup a second listening socket on same port ?
> 
> I simplified the test case, which is far from what haproxy do (just forgot to 
> explain the real behaviour).
> To reload the configuration, a new haproxy process is launched, sending a 
> signal to the previous one and asking it to free the ports for a while (the 
> shutdown part in the test). The new process then tries to bind the ports, 
> which worked until 2.6.38 (if an error occurs, a new signal is sent to the 
> previous process to listen to its sockets again).

Indeed, here's what normally happens when haproxy reloads.

New process is loaded with a new config. Once the config correctly parses,
it sends a signal to the previous process asking it to temporarily release
its listening ports so that the new one can bind, hence the shutdown(SHUT_RD)
performed in the old process.

Then the new process can grab the ports and listen to them. Once that's OK,
it sends another signal to the old process telling it it can go away. But
if the new process failed to completely start (eg: could not grab one port),
then it sends a third signal to the old process asking it to rebind the port
and serve them again, and the new one dies with an error.

That way, the service is never interrupted even if the new config fails
late, because the old process has the ability to rebind to the port it
temporarily released.

Now with 2.6.38, as Cyril diagnosed it, the new bind() fails when the
old process has just performed its shutdown(SHUT_RD), preventing the
new process from binding to the ports until the old process has
definitely closed them.

The behaviour is very useful, because the old process might have lost
its privileges, it will not have to rebind to the socket, just listen
on it again since it is never closed.

This is quite embarrassing, because this code used to work for the
last 10 years, at least since kernel 2.2, and maybe even 2.0, I don't
remember.

I'm not sure what the original intent of the patch was, not what was
the reported issue, but maybe we could find a way to both fix the
reported issue (if any) and restore the old behaviour in order not
to break existing programs.

Best regards,
Willy

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ