[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFwfFb=LXU77AbiPDHgWcpBwTBoJB4EMCZgTgX32cxMYWw@mail.gmail.com>
Date: Fri, 30 Oct 2015 16:52:41 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
David Miller <davem@...emloft.net>,
Stephen Hemminger <stephen@...workplumber.org>,
Network Development <netdev@...r.kernel.org>,
David Howells <dhowells@...hat.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [Bug 106241] New: shutdown(3)/close(3) behaviour is incorrect for
sockets in accept(3)
On Fri, Oct 30, 2015 at 3:33 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> On Fri, Oct 30, 2015 at 02:50:46PM -0700, Linus Torvalds wrote:
>
>> Anyway. This is a pretty simple patch, and I actually think that we
>> could just get rid of the "next_fd" logic entirely with this. That
>> would make this *patch* be more complicated, but it would make the
>> resulting *code* be simpler.
>
> Dropping next_fd would screw you in case of strictly sequential allocations...
I don't think it would matter in real life, since I don't really think
you have lots of fd's with strictly sequential behavior.
That said, the trivial "open lots of fds" benchmark would show it, so
I guess we can just keep it. The next_fd logic is not expensive or
complex, after all.
> Your point re power-of-two allocations is well-taken, but then I'm not
> sure that kzalloc() is good enough here.
Attached is an updated patch that just uses the regular bitmap
allocator and extends it to also have the bitmap of bitmaps. It
actually simplifies the patch, so I guess it's better this way.
Anyway, I've tested it all a bit more, and for a trivial worst-case
stress program that explicitly kills the next_fd logic by doing
for (i = 0; i < 1000000; i++) {
close(3);
dup2(0,3);
if (dup(0))
break;
}
it takes it down from roughly 10s to 0.2s. So the patch is quite
noticeable on that kind of pattern.
NOTE! You'll obviously need to increase your limits to actually be
able to do the above with lots of file descriptors.
I ran Eric's test-program too, and find_next_zero_bit() dropped to a
fraction of a percent. It's not entirely gone, but it's down in the
noise.
I really suspect this patch is "good enough" in reality, and I would
*much* rather do something like this than add a new non-POSIX flag
that people have to update their binaries for. I agree with Eric that
*some* people will do so, but it's still the wrong thing to do. Let's
just make performance with the normal semantics be good enough that we
don't need to play odd special games.
Eric?
Linus
View attachment "patch.diff" of type "text/plain" (4590 bytes)
Powered by blists - more mailing lists