Message-ID: <1326547247.5287.19.camel@edumazet-laptop>
Date: Sat, 14 Jan 2012 14:20:47 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Li Yu <raise.sail@...il.com>
Cc: linux-kernel@...r.kernel.org, davidel@...ilserver.org
Subject: Re: The thundering herd like problem when multi epolls on one fd
On Saturday 14 January 2012 at 19:13 +0800, Li Yu wrote:
> Hi,
>
> My buddy reported a thundering herd problem when using epoll
> on TCP listen sockets. He described their usage as follows:
>
> 1. sk = new tcp_listen_socket();
> 2. create many child processes or threads.
> 3. in the newly created processes (threads), use the epoll API on the
> listen sk to provide HTTP service.
>
> This usage pattern means we have multiple wait queues when
> accepting on one socket, and the wake-up is not exclusive, so we get a
> thundering-herd-like problem. I have also heard that many popular
> applications use this pattern, including at least nginx, lighttpd and haproxy.
It is not very scalable. But we really lack a fanout mechanism to allow
better parallelism on accept(). It is not a poll() vs select() vs epoll()
problem per se, but a generic problem.
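
For illustration, the usage pattern quoted above can be sketched as below.
This is only a minimal sketch under stated assumptions: the helper names
(make_listener, worker_watch, setup_workers) are hypothetical, the socket
binds to INADDR_ANY on a kernel-chosen port, and the "workers" are set up
in a single process rather than in forked children or threads. The point
is only that every worker registers the same listen socket in its own
epoll instance.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Step 1: create one TCP listen socket (port 0: the kernel picks a port). */
static int make_listener(void)
{
    struct sockaddr_in addr;
    int sk = socket(AF_INET, SOCK_STREAM, 0);

    if (sk < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0;
    if (bind(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(sk, 128) < 0) {
        close(sk);
        return -1;
    }
    return sk;
}

/* Step 3: each worker creates its own epoll instance watching the shared sk.
 * Returns the new epoll fd, or -1 on error. */
static int worker_watch(int sk)
{
    struct epoll_event ev;
    int ep = epoll_create1(0);

    if (ep < 0)
        return -1;
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;        /* readable means a connection is pending */
    ev.data.fd = sk;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sk, &ev) < 0) {
        close(ep);
        return -1;
    }
    return ep;
}

/* Steps 1-3 in one process (assumption: real servers fork or spawn threads
 * in step 2). Returns the number of epoll instances set up, or -1. */
static int setup_workers(int nworkers, int *sk_out, int *eps)
{
    int i;

    assert(nworkers > 0);
    *sk_out = make_listener();
    if (*sk_out < 0)
        return -1;
    for (i = 0; i < nworkers; i++) {
        eps[i] = worker_watch(*sk_out);
        if (eps[i] < 0)
            return -1;
    }
    return nworkers;
}
```

Each epoll instance adds its own non-exclusive entry to the wait queue of
the listen socket, which is why a single incoming connection can wake
every worker.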
> So should we change this wake-up behavior to exclusive too?
>
Certainly not.
> Below is a simple patch (tested and working) for epoll() to do it.
> Of course, we should also fix the select() and poll() syscalls if it is right.
>
> Thanks.
>
> Yu
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 828e750..a3d6ab4 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -898,7 +899,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
> init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
> pwq->whead = whead;
> pwq->base = epi;
> - add_wait_queue(whead, &pwq->wait);
> + add_wait_queue_exclusive(whead, &pwq->wait);
> list_add_tail(&pwq->llink, &epi->pwqlist);
> epi->nwait++;
> } else {
> --
What happens if the awakened thread does not consume the event, and
prefers to exit instead?
If several threads are doing select()/poll()/epoll() on a shared fd,
they must _all_ be notified that the fd is ready, as the manpages state.
Doing otherwise would require the prior consent of the user, using a
special flag for example, and documentation.
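
The notify-all behavior described above can be checked with a small
sketch. Assumptions: an eventfd stands in for the shared listen socket
(it avoids needing a real TCP connection), the function name
count_notified is hypothetical, and everything runs in one thread with a
zero timeout instead of in blocked waiters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_WATCHERS 16

/* Register one shared fd in n separate epoll instances, make it readable
 * once, and return how many instances report it ready (-1 on setup error). */
static int count_notified(int n)
{
    int eps[MAX_WATCHERS];
    int made = 0, ready = 0, i;
    uint64_t one = 1;
    int fd;

    assert(n > 0 && n <= MAX_WATCHERS);
    fd = eventfd(0, 0);         /* stand-in for the shared listen socket */
    if (fd < 0)
        return -1;

    for (i = 0; i < n; i++) {
        struct epoll_event ev;

        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN;    /* default, non-exclusive registration */
        ev.data.fd = fd;
        eps[i] = epoll_create1(0);
        if (eps[i] < 0) {
            ready = -1;
            break;
        }
        made++;
        if (epoll_ctl(eps[i], EPOLL_CTL_ADD, fd, &ev) < 0) {
            ready = -1;
            break;
        }
    }

    /* One event: the fd becomes readable exactly once. */
    if (ready == 0 && write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
        ready = -1;

    if (ready == 0) {
        for (i = 0; i < n; i++) {
            struct epoll_event out;

            /* Timeout 0: only ask whether this instance sees the fd ready. */
            if (epoll_wait(eps[i], &out, 1, 0) == 1 && (out.events & EPOLLIN))
                ready++;
        }
    }

    for (i = 0; i < made; i++)
        close(eps[i]);
    close(fd);
    return ready;
}
```

Nobody consumes the event here, so with level-triggered registration
every instance keeps reporting the fd as ready. That is the documented
behavior an exclusive wake-up would silently change. For reference,
Linux 4.5 and later, well after this thread, provide an opt-in
EPOLLEXCLUSIVE flag for EPOLL_CTL_ADD; it is not exercised in this sketch.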