Date:	Sun, 15 Jan 2012 23:41:37 +0800
From:	Li Yu <raise.sail@...il.com>
To:	eric.dumazet@...il.com
CC:	linux-kernel@...r.kernel.org, davidel@...ilserver.org
Subject: Re: The thundering herd like problem when multi epolls on one fd



2012/1/14 Eric Dumazet <eric.dumazet@...il.com>:
> On Saturday, 14 January 2012 at 19:13 +0800, Li Yu wrote:
>> Hi,
>>
>>       My buddy reported a thundering herd problem when using epoll
>> on TCP listen sockets. He described their usage as follows:
>>
>>       1. sk = new tcp_listen_socket();
>>       2. create many child processes or threads.
>>       3. in the newly created processes (threads), use the epoll API
>> on the listen sk to provide HTTP service.
>>
>>       This usage pattern means there are multiple epoll waiters queued
>> on a single listen socket, and the wake-up is not exclusive, so we get
>> a thundering-herd-like problem. I have also heard that many popular
>> applications can use this pattern, including at least nginx, lighttpd
>> and haproxy.
>
> It is not very scalable. But we really lack a fanout mechanism to allow
> better parallelism on accept(). It's not a poll() vs select() vs epoll()
> problem per se, but a generic problem.
>

I am interested in this issue. My rough idea is that it could use XPS
or RPS/RSS information to decide which tasks on the target processor
to wake up.
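
To make the reported behavior concrete, here is a minimal userspace
sketch of the usage pattern quoted above. It is a simplification, not
the code my buddy runs: all epoll instances live in one process instead
of in separate children, and the names run_herd_demo and herd_result
are made up for this example. It only shows that one incoming
connection makes every epoll instance report the listen socket as
ready, while only one accept() can succeed.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_WATCHERS 16

/* How many epoll instances saw the event, and how many accept()s won. */
struct herd_result {
    int woken;
    int accepted;
};

/* Hypothetical helper: nwatchers epoll instances watch one non-blocking
 * TCP listen socket on loopback, then a single client connects.
 * Returns 0 on success, -1 if the setup fails. */
static int run_herd_demo(int nwatchers, struct herd_result *res)
{
    int epfds[MAX_WATCHERS];
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    int lsk, csk, i, nep = 0, rc = -1;

    if (nwatchers < 1 || nwatchers > MAX_WATCHERS)
        return -1;
    res->woken = 0;
    res->accepted = 0;

    /* step 1: one listen socket, port chosen by the kernel */
    lsk = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (lsk < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lsk, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lsk, 8) < 0 ||
        getsockname(lsk, (struct sockaddr *)&addr, &alen) < 0)
        goto out_lsk;

    /* steps 2 and 3: each "worker" gets its own epoll on the same sk */
    for (i = 0; i < nwatchers; i++) {
        struct epoll_event ev;

        epfds[i] = epoll_create1(0);
        if (epfds[i] < 0)
            goto out_ep;
        nep++;
        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN;
        ev.data.fd = lsk;
        if (epoll_ctl(epfds[i], EPOLL_CTL_ADD, lsk, &ev) < 0)
            goto out_ep;
    }

    /* exactly one client connection arrives */
    csk = socket(AF_INET, SOCK_STREAM, 0);
    if (csk < 0)
        goto out_ep;
    if (connect(csk, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out_csk;

    /* every epoll instance is asked whether the listen sk is ready */
    for (i = 0; i < nwatchers; i++) {
        struct epoll_event ev;

        if (epoll_wait(epfds[i], &ev, 1, 1000) == 1)
            res->woken++;
    }
    /* every woken worker then tries to accept; the losers get EAGAIN */
    for (i = 0; i < nwatchers; i++) {
        int fd = accept(lsk, NULL, NULL);

        if (fd >= 0) {
            res->accepted++;
            close(fd);
        }
    }
    rc = 0;
out_csk:
    close(csk);
out_ep:
    for (i = 0; i < nep; i++)
        close(epfds[i]);
out_lsk:
    close(lsk);
    return rc;
}
```

Calling run_herd_demo(4, &res) on a Linux box with loopback up should,
as far as I can tell, leave res.woken at 4 and res.accepted at 1: four
wake-ups for a single connection.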

>> So should we change this wake-up behavior to exclusive too?
>>
>
> Certainly not.
>
>>       Below is a simple patch (tested and works) for epoll() to do it;
>> of course, we should also fix the select() and poll() syscalls if this
>> is right.
>>
>>       Thanks.
>>
>> Yu
>>
>> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
>> index 828e750..a3d6ab4 100644
>> --- a/fs/eventpoll.c
>> +++ b/fs/eventpoll.c
>> @@ -898,7 +899,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
>>                 init_waitqueue_func_entry(&pwq->wait, ep_poll_callback);
>>                 pwq->whead = whead;
>>                 pwq->base = epi;
>> -               add_wait_queue(whead, &pwq->wait);
>> +               add_wait_queue_exclusive(whead, &pwq->wait);
>>                 list_add_tail(&pwq->llink, &epi->pwqlist);
>>                 epi->nwait++;
>>         } else {
>> --
>
>
> What happens if the awakened thread does not consume the event, and
> prefers to exit?

In my view, if that happens, it should be considered a bug in the
application.
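
To illustrate the concern in the quoted question, below is a small
userspace model of exclusive versus non-exclusive wake-ups. It is not
kernel code: the model_* names are invented for this example, only
WQ_FLAG_EXCLUSIVE borrows the kernel's name, and it assumes every
wake-up callback succeeds and that one event is delivered with
nr_exclusive = 1, as a plain wake_up() does. The kernel's ordering of
exclusive and non-exclusive entries is ignored, since each run here
uses a single kind of waiter.

```c
#include <stddef.h>

#define WQ_FLAG_EXCLUSIVE 0x01  /* kernel flag name, reused in this model */
#define MAX_WAITERS 16

/* One waiter, standing in for one epoll instance hooked on the socket. */
struct model_waiter {
    unsigned int flags;
    int woken;
};

struct model_waitqueue {
    struct model_waiter *entries[MAX_WAITERS];
    size_t count;
};

/* Models add_wait_queue() (flags == 0) or add_wait_queue_exclusive()
 * (flags == WQ_FLAG_EXCLUSIVE). Returns 0, or -1 if the queue is full. */
static int model_add_wait(struct model_waitqueue *q, struct model_waiter *w,
                          unsigned int flags)
{
    if (q->count >= MAX_WAITERS)
        return -1;
    w->flags = flags;
    w->woken = 0;
    q->entries[q->count++] = w;
    return 0;
}

/* Walk the queue and wake waiters in order. Non-exclusive waiters are
 * always woken; the walk stops once nr_exclusive exclusive waiters have
 * been woken. Returns the number of waiters woken. */
static int model_wake_up(struct model_waitqueue *q, int nr_exclusive)
{
    int n = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct model_waiter *w = q->entries[i];

        w->woken = 1;
        n++;
        if ((w->flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
            break;
    }
    return n;
}

/* Queue nwaiters waiters of the same kind, deliver one event, and
 * report how many of them were woken (-1 on a bad nwaiters). */
static int model_woken_count(int nwaiters, unsigned int flags)
{
    struct model_waiter waiters[MAX_WAITERS];
    struct model_waitqueue q;
    int i;

    if (nwaiters < 0 || nwaiters > MAX_WAITERS)
        return -1;
    q.count = 0;
    for (i = 0; i < nwaiters; i++)
        model_add_wait(&q, &waiters[i], flags);
    return model_wake_up(&q, 1);
}
```

In this model, model_woken_count(4, 0) wakes all four waiters, while
model_woken_count(4, WQ_FLAG_EXCLUSIVE) wakes only the first. If that
one waiter exits without calling accept(), the other three were never
woken for this event, which is the case the question is about.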

>
> If several threads are doing select()/poll()/epoll() on a shared fd,
> they _all_ must be notified the fd is ready, as manpages claim.
>
> Doing otherwise would require the prior consent of the user, using a
> special flag for example, and documentation.
>

Indeed, thanks!

Yu
