Message-ID: <562F577E.6000901@oracle.com>
Date: Tue, 27 Oct 2015 10:52:46 +0000
From: Alan Burlison <Alan.Burlison@...cle.com>
To: Casper.Dik@...cle.com, Al Viro <viro@...IV.linux.org.uk>
CC: David Miller <davem@...emloft.net>, eric.dumazet@...il.com,
stephen@...workplumber.org, netdev@...r.kernel.org,
dholland-tech@...bsd.org
Subject: Re: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect
for sockets in accept(3)
On 27/10/2015 09:08, Casper.Dik@...cle.com wrote:
> Generally I wouldn't see that as a problem, but in the case of a socket
> blocking on accept indefinitely, I do see it as a problem, especially as
> the thread actually wants to stop listening.
>
> But in general, this is basically a problem with the application: the file
> descriptor space is shared between threads, and if one thread is sniping
> at open files you have a problem, and whatever the kernel does in that
> case perhaps doesn't matter all that much: the application needs to be
> fixed anyway.
The scenario in Hadoop is that the FD is being used by a thread that's
waiting in accept() and another thread wants to shut it down, e.g. because
the application is terminating and needs to stop all its threads cleanly.
I agree the use of shutdown()+close() on Linux or dup2() on Solaris is
pretty much an application-level hack; the concern in both cases is
that the file descriptor being used in the accept() might be recycled
by another thread. However, that raises the question of why the FD isn't
properly encapsulated by the application in a singleton object, with the
required shutdown semantics provided by a mechanism that invalidates the
singleton and its contained FD.
There are other mechanisms that could be used to do a clean shutdown
that don't require the OS to provide workarounds for arguably broken
application behaviour, for example setting a 'shutdown' flag in the
object and then doing a dummy connect() to the accepting FD to kick the
thread out of the accept(), so that it re-checks the 'shutdown' flag
and doesn't re-enter accept().
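As a sketch of that flag-plus-dummy-connect() approach (Python for brevity; the class and method names are illustrative, not Hadoop's actual code):

```python
import socket
import threading

class AcceptLoop:
    """Accept loop that is stopped by setting a flag and then making a
    dummy connection to the listening socket, so the listening FD is
    never closed out from under the accepting thread."""

    def __init__(self, host="127.0.0.1", port=0):
        self._listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._listener.bind((host, port))
        self._listener.listen(5)
        self._shutdown = threading.Event()
        self.address = self._listener.getsockname()

    def run(self):
        """Accept connections until stop() is called."""
        while True:
            conn, _ = self._listener.accept()
            conn.close()  # a real server would hand conn off to a worker
            # Re-check the flag after every accept(): the dummy connect()
            # made by stop() lands here and breaks the loop.
            if self._shutdown.is_set():
                break
        # Only the thread that owns the FD ever closes it.
        self._listener.close()

    def stop(self):
        # Set the flag first, then kick the accepting thread out of
        # accept() with a dummy connection to our own listening address.
        self._shutdown.set()
        with socket.create_connection(self.address):
            pass
```

Because only the accepting thread ever closes the listening FD, no other thread can observe a stale or recycled descriptor.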
If the object encapsulating a FD is invalidated, and that prevents the FD
from being used any more because the only access is via that object, then
it simply doesn't matter if the FD is reused elsewhere: there can be no
race, so a complicated, platform-dependent dance isn't needed.
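A minimal sketch of such an encapsulating object (Python, hypothetical names): once invalidated, every accessor fails instead of touching a possibly recycled FD.

```python
import socket
import threading

class GuardedSocket:
    """Wraps a socket so it can be invalidated exactly once; after
    invalidation, access raises instead of touching a stale FD."""

    def __init__(self, sock):
        self._sock = sock
        self._lock = threading.Lock()

    def invalidate(self):
        """Close and forget the socket; safe to call more than once."""
        with self._lock:
            if self._sock is not None:
                self._sock.close()
                self._sock = None

    def fileno(self):
        """Return the FD, or raise if the object has been invalidated."""
        with self._lock:
            if self._sock is None:
                raise ValueError("socket has been invalidated")
            return self._sock.fileno()
```

With every use of the FD funnelled through the guard, a recycled descriptor number elsewhere in the process is simply invisible to callers.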
Unfortunately Hadoop isn't the only thing that pulls the shutdown()
trick, so I don't think there's a simple fix for this, as discussed
earlier in the thread. Having said that, if close() on Linux also did an
implicit shutdown() it would mean that well-written applications that
handled the scoping, sharing and reuse of FDs properly could just call
close() and have it work the same way across *NIX platforms.
--
Alan Burlison
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html