[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <562594E1.8040403@oracle.com>
Date: Tue, 20 Oct 2015 02:12:01 +0100
From: Alan Burlison <Alan.Burlison@...cle.com>
To: Eric Dumazet <eric.dumazet@...il.com>,
Stephen Hemminger <stephen@...workplumber.org>
CC: netdev@...r.kernel.org
Subject: Re: Fw: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect
for sockets in accept(3)
On 20/10/2015 00:33, Eric Dumazet wrote:
> It looks it is a long standing problem, right ?
Yep, seems so.
> inet_shutdown() has this very specific comment from beginning of git
> tree :
>
> switch (sk->sk_state) {
> ...
> /* Remaining two branches are temporary solution for missing
> * close() in multithreaded environment. It is _not_ a good idea,
> * but we have no choice until close() is repaired at VFS level.
> */
> case TCP_LISTEN:
> if (!(how & RCV_SHUTDOWN))
> break;
> /* Fall through */
> case TCP_SYN_SENT:
> err = sk->sk_prot->disconnect(sk, O_NONBLOCK);
> sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
> break;
> }
I think it's probably an intrinsic part of the way *NIX file descriptors
and their reuse has worked since the dawn of *NIX time - at which time
threads didn't exist, so this problem didn't either. The advent of
threads made this this hole possible, which is I believe what the
comment above is pointing out. The problem people are trying to solve by
calling shutdown() on an listen()ing socket is the race in MT programs
between a socket being closed and the same file descriptor being
recycled by a subsequent open()/socket() etc.
> Claiming Solaris does it differently is kind of moot.
> linux is not Solaris.
Agreed that Linux != Solaris, but the argument I'm being faced with is
that anything that doesn't behave in the same way as Linux is wrong by
definition. And the problem with that is that the Linux behaviour of
shutdown() on a listen()/accept() socket is I believe incorrect anyway
as my read of POSIX says that shutdown() is only valid on connected
sockets, and sockets in listen()/accept() aren't connected by
definition, and Linux allows shutdown() to succeed when it should
probably return ENOTCONN. Yes, there's a potential race with FDs being
recycled, but you can get that with vanilla file FDs as well, where
shutdown() isn't an option.
Another problem is that if I call close() on a Linux socket that's in
accept() the accept call just sits there until there's an incoming
connection, which succeeds even though the socket is supposed to be
closed, but then an immediately following accept() on the same socket
fails. And yet another problem is that poll() on a socket that's had
listen() called on it returns immediately even if there's no incoming
connection on it, which I believe makes multiplexing a set of sockets
which includes a socket you want to accept() on impossible. The test
program I attached to the bug allows you to play around with the
different combinations.
> Unless proven a real problem (and not only by trying to backport from Solaris to linux),
> we'll probably wont change this.
It's a real problem (with Hadoop, which contains C/C++ to do low-level
I/O) and it's the other way around, I am porting that code from Linux to
Solaris.
I accept that you probably can't change the behaviour of shutdown() in
Linux without breaking existing code, for example it seems libmicrohttpd
also assumes it's OK to call shutdown() on a listen() socket on Linux,
see https://lists.gnu.org/archive/html/libmicrohttpd/2011-09/msg00024.html
However, even if the shutdown() behaviour can't be changed the Linux
close()/poll() behaviour on a listen()/accept() sockets seems rather odd.
There *may* be a way around this that's race-free and cross-platform
involving the use of /dev/null and dup2(), see Listing Five on
http://www.drdobbs.com/parallel/file-descriptors-and-multithreaded-progr/212001285
but I haven't confirmed it works yet.
Thanks,
--
Alan Burlison
--
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists