Message-ID: <20071026224622.GH8181@ftp.linux.org.uk>
Date: Fri, 26 Oct 2007 23:46:22 +0100
From: Al Viro <viro@....linux.org.uk>
To: Stephen Hemminger <shemminger@...ux-foundation.org>
Cc: "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: Files, sockets, and closing
On Fri, Oct 26, 2007 at 03:09:01PM -0700, Stephen Hemminger wrote:
> > close() from another thread is not a way to abort blocked accept(). Never
> > promised to be that. Just as close() from another thread is not a way to
> > abort blocked write() or read() or sendmsg() or...
>
> The problem is the Linux interpretation conflicts with the expectation
> of applications that run on other Unix systems. Most likely, it is
> one of those corner cases not covered by SUS or Posix specs otherwise
> it would have come up earlier. The existing Linux behavior works fine
> it just isn't expected (or well documented).
>
> I'm fine with just closing the bug (which is what I did initially), but
> where should this get documented?

close(2), perhaps? "A system call on an opened file holds a reference to
that opened file regardless of what happens to the descriptor originally
passed to it", or something to the same effect...
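
A userland analogue of that wording, as a rough sketch rather than
anything the kernel literally does: dup() plays the part of the extra
reference, and the opened file (a pipe here) stays usable after the
original descriptor is closed. The function name is made up for the
illustration.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the opened file is still readable
 * through a second reference after the original descriptor is closed,
 * 0 if not, -1 if the setup itself failed.
 */
int open_file_survives_close(void)
{
    int p[2];
    char buf[4] = {0};

    if (pipe(p) < 0)
        return -1;

    int extra = dup(p[0]);      /* second reference to the same opened file */
    if (extra < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    assert(extra != p[0]);      /* a new descriptor, same opened file */

    close(p[0]);                /* the descriptor is gone ... */

    /* ... but the opened file is not: the read side still works. */
    ssize_t w = write(p[1], "ok", 2);
    ssize_t n = (w == 2) ? read(extra, buf, sizeof(buf) - 1) : -1;

    close(extra);
    close(p[1]);
    return n == 2 && strcmp(buf, "ok") == 0;
}
```

A syscall already in progress is in the position of "extra" here: closing
the descriptor it was entered with does not take the opened file away
from it.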

That's what really happens - you get the same effect as if there had been
an additional temporary opened descriptor for that sucker. And really, a
multithreaded application that has one thread rip descriptors out from under
another should be damn careful on _any_ system. Anything that goes
"I've got -EBADF, guess another thread has removed that descriptor,
got to recover" is insane - in effect, it calls accept() blindly and
hopes that the race will play out nicely, without hitting this sequence:
	* thread A calls accept(3)
	* thread B calls close()
	* thread B calls e.g. dup() for an unrelated reason and gets the same
	  descriptor reused
	* thread A finally gets from libc to accept(2), sees no EBADF and
	  proceeds with accept() on a completely unrelated socket, with no
	  indication of the problem (or returns giving you a bogus errno,
	  depending on what the hell that descriptor happens to be).
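
The interleaving above can be replayed in a single thread, which makes it
deterministic. This is only a sketch: the function name is made up, and a
pipe stands in for "whatever thread B happened to dup()". It relies on the
POSIX rule that dup() returns the lowest-numbered free descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper replaying the sequence in one thread.
 * Returns 1 if the closed descriptor number came back from dup() and
 * accept() on it failed with ENOTSOCK rather than EBADF, 0 if not,
 * -1 if the setup itself failed.
 */
int replay_descriptor_reuse(void)
{
    int unrelated[2];
    if (pipe(unrelated) < 0)
        return -1;

    int listener = socket(AF_UNIX, SOCK_STREAM, 0);
    if (listener < 0) {
        close(unrelated[0]);
        close(unrelated[1]);
        return -1;
    }

    int stale = listener;            /* the number "thread A" is about to use */
    close(listener);                 /* "thread B" closes it ... */
    int reused = dup(unrelated[0]);  /* ... and dups something unrelated */
    assert(reused >= 0);

    /* "thread A" finally reaches accept(2) with its stale number. */
    errno = 0;
    int r = accept(stale, NULL, NULL);
    int saw = errno;

    close(reused);
    close(unrelated[0]);
    close(unrelated[1]);
    return reused == stale && r < 0 && saw == ENOTSOCK;
}
```

Here the reused descriptor is a pipe, so thread A gets the bogus errno
(ENOTSOCK) instead of EBADF. Had thread B created another listening
socket instead, accept() would have quietly gone ahead on the wrong one.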
IOW, if you rely on -EBADF to deal with such (userland) races, you are
extremely likely to be screwed. On Linux, on FreeBSD, on Solaris, whatever.
In very controlled circumstances you might get away with that, but it's
almost certainly a Very Bad Idea(tm).

The bottom line: if the descriptor table is a shared resource in your
multithreaded program, treat it as such. The kernel will survive having
descriptors closed in the middle of a syscall just fine; your userland
code is a different story.
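
One way to treat it as such, as a minimal sketch and not the only option:
never close the listening descriptor from another thread. The accepting
thread waits in poll() on both the listener and a separate wakeup
descriptor (the read end of a pipe, say), and it alone closes the
listener once told to stop. The function name, the wakeup-pipe
convention and the use of ECANCELED as the "told to stop" signal are all
assumptions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for either a connection on `listener` or a
 * wakeup on `cancel_fd`. Returns the accepted descriptor, or -1 with
 * errno set (ECANCELED if the wakeup descriptor fired).
 */
int accept_or_cancel(int listener, int cancel_fd)
{
    assert(listener != cancel_fd);

    struct pollfd fds[2] = {
        { .fd = listener,  .events = POLLIN },
        { .fd = cancel_fd, .events = POLLIN },
    };

    for (;;) {
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        if (fds[1].revents) {       /* cancellation wins over a pending connection */
            errno = ECANCELED;
            return -1;
        }
        if (fds[0].revents)
            /* Assumes `listener` is O_NONBLOCK, so this cannot block if
             * the pending connection went away after poll() returned. */
            return accept(listener, NULL, NULL);
    }
}
```

The thread that wants accept() aborted writes a byte to the other end of
the pipe and leaves the listener alone. The descriptor number is then
only ever released by the thread that uses it, so nothing can be reused
underneath it.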