Message-ID: <Pine.LNX.4.64.0811021258250.9451@alien.or.mcafeemobile.com>
Date:	Sun, 2 Nov 2008 13:17:13 -0800 (PST)
From:	Davide Libenzi <davidel@...ilserver.org>
To:	Olaf van der Spek <olafvdspek@...il.com>
cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: epoll behaviour after running out of descriptors

On Sun, 2 Nov 2008, Olaf van der Spek wrote:

> On Sun, Nov 2, 2008 at 8:27 PM, Davide Libenzi <davidel@...ilserver.org> wrote:
> >> I know what TIME_WAIT is. I just think it's not applicable to this situation.
> >
> > It is. You are saturating the port space, so no new POLLIN/accept events
> > are sent (until some TIME_WAIT clears), so epoll_wait() returns nothing
> > (or does not return, if INF timeo).
> > Keeping only 1K (if this is what you meant with your *only* 1K)
> > connections *alive*, does not mean the trail that does moving 1K
> > connections leave, is free.
> > If you ever played with things like httperf, you should know what I'm
> > talking about.
> 
> Wouldn't the port space require about 20+ k connects? This issue
> happens after 1 k.

The reason for "When accept returns EMFILE, I call epoll_wait and accept
and it returns with another EMFILE." is that your socket-close logic
is broken. You get an event for the listening fd, you call accept(2),
and in one or two passes you fill up the available fd space. Then you go
back to epoll_wait(), and then back to accept(2) again, all without
triggering the file-close-relief code (yes, you fill up 1K fds *before*
30 seconds). Of course you get another EMFILE. By the time the close loop
finally triggers, the client has likely quit trying, or the kernel accept
backlog is full and no new events (remember, you chose ET) are triggered.
EMFILE is not EAGAIN: it means that the fd can still have something
for you. Going back to sleep with (EMFILE && ET) is bad mojo.
This is more food for linux-userspace than linux-kernel though.
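The EMFILE-vs-EAGAIN distinction above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions, not code
from the thread: the names classify_accept_errno, drain_accept, on_conn
and relieve are hypothetical, and relieve stands in for whatever
file-close-relief logic the application has. The point it shows is that
only EAGAIN means the listening fd is drained; on EMFILE the relief code
runs right away instead of waiting for a later pass.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* What to do next, given errno from a failed accept(2). */
enum accept_action {
    ACCEPT_DRAINED,     /* EAGAIN: queue empty, safe to wait for the next edge */
    ACCEPT_RETRY,       /* transient error: call accept(2) again */
    ACCEPT_NEED_RELIEF, /* EMFILE/ENFILE: connections may still be pending */
    ACCEPT_FATAL        /* anything else */
};

enum accept_action classify_accept_errno(int err)
{
    if (err == EAGAIN || err == EWOULDBLOCK) return ACCEPT_DRAINED;
    if (err == EINTR || err == ECONNABORTED) return ACCEPT_RETRY;
    if (err == EMFILE || err == ENFILE) return ACCEPT_NEED_RELIEF;
    return ACCEPT_FATAL;
}

/* Hypothetical callbacks supplied by the application. */
typedef void (*conn_handler)(int conn_fd); /* takes ownership of a new fd */
typedef int (*relief_handler)(void);       /* closes idle fds, returns how many */

/*
 * Call after an edge-triggered event on a non-blocking listening socket.
 * Returns the number of connections accepted, or -1 on a fatal error.
 * *still_pending is set to 1 when the loop stopped on EMFILE/ENFILE with
 * nothing left to close; the caller should then not block indefinitely
 * in epoll_wait(), because no new edge may ever arrive.
 */
int drain_accept(int listen_fd, conn_handler on_conn,
                 relief_handler relieve, int *still_pending)
{
    int accepted = 0;

    assert(on_conn != NULL && relieve != NULL && still_pending != NULL);
    *still_pending = 0;

    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0) {
            on_conn(fd);
            accepted++;
            continue;
        }
        switch (classify_accept_errno(errno)) {
        case ACCEPT_DRAINED:
            return accepted;
        case ACCEPT_RETRY:
            continue;
        case ACCEPT_NEED_RELIEF:
            if (relieve() > 0)
                continue;       /* freed something: retry accept(2) now */
            *still_pending = 1; /* nothing to free: do not sleep forever */
            return accepted;
        case ACCEPT_FATAL:
        default:
            return -1;
        }
    }
}
```

When still_pending comes back set, one option (again an assumption, not
something stated in the thread) is to call epoll_wait() with a short
timeout and retry drain_accept() once some connections have been closed.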



- Davide

