linux-kernel - Re: [PATCH] rfc: threaded epoll

Message-ID: <Pine.LNX.4.64.0705071259580.1227@turbotaz.ourhouse>
Date:	Mon, 7 May 2007 13:17:49 -0500 (CDT)
From:	Chase Venters <chase.venters@...entec.com>
To:	Davide Libenzi <davidel@...ilserver.org>
cc:	Chase Venters <chase.venters@...entec.com>,
	Davi Arnaut <davi@...ent.com.br>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] rfc: threaded epoll_wait thundering herd

On Mon, 7 May 2007, Davide Libenzi wrote:

> On Mon, 7 May 2007, Chase Venters wrote:
>
>> I'm working on event handling code for multiple projects right now, and my
>> method of calling epoll_wait() is to do so from several threads. I've glanced
>> at the epoll code but obviously haven't noticed the wake-all behavior... good
>> to know. I suppose I'm going to have to hack around this problem by wrapping
>> epoll_wait() calls in a mutex. That sucks - it means other threads won't be
>> able to 'get ahead' by preparing their wait before it is their turn to dequeue
>> events.
>>
>> In any case, I think having multiple threads blocking on epoll_wait() is a
>> much saner idea than one thread which then passes out events, so I must voice
>> my support for fixing this case. Why this is the exception instead of the norm
>> is a little baffling, but I've seen so many perverse things in multi-threaded
>> code...
>
> The problem that you can have with multiple threads calling epoll_wait()
> on an SMP system, is that if you sweep 100 events in one thread, and this
> thread goes alone in processing those, you may have other CPUs idle while
> the other thread is handling those. Either you call epoll_wait() from
> multiple threads by keeping the event buffer passed to epoll_wait() fairly
> limited, or you use a single epoll_wait() fetcher with a queue(s) from
> which worker threads pull from.

Working with smaller quantums is indeed the right thing to do.

In any case, let's consider why you're getting 100 events from one 
epoll_wait():

1. You have a single thread doing the dequeue, and it is slow (perhaps 
because of the time it spends requeuing the work to other threads).

2. Your load is so high that you are taking lots and lots of events, in 
which case the other epoll_wait() threads are going to be woken up very 
soon with work anyway. In this scenario you will be "scheduling" work at 
"odd" times based on its arrival, but that's just another argument to use 
smaller quantums.
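
A minimal sketch of the "smaller quantums" idea, assuming each thread 
passes a deliberately small event buffer to epoll_wait(). The names 
SMALL_BATCH, wait_small_batch() and make_ready_epoll() are hypothetical 
and only serve the illustration; the batch size of 4 is an arbitrary 
choice, not a recommendation from this thread.

```c
#include <sys/epoll.h>
#include <sys/eventfd.h>

/* Hypothetical quantum: the most events one epoll_wait() call may dequeue. */
#define SMALL_BATCH 4

/* Dequeue at most SMALL_BATCH ready events.
 * Returns the number of events stored in `events`, or -1 on error. */
static int wait_small_batch(int epfd, struct epoll_event *events, int timeout_ms)
{
    return epoll_wait(epfd, events, SMALL_BATCH, timeout_ms);
}

/* Demo helper (assumption, not from the thread): build an epoll instance
 * watching `count` eventfds that are already readable.
 * Returns the epoll fd, or -1 on error. */
static int make_ready_epoll(int count)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (int i = 0; i < count; i++) {
        /* An initial counter of 1 makes the eventfd readable immediately. */
        int fd = eventfd(1, EFD_NONBLOCK);
        if (fd < 0)
            return -1;

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
            return -1;
    }
    return epfd;
}
```

With more descriptors ready than SMALL_BATCH, a single call returns only 
SMALL_BATCH of them and leaves the rest on the ready list for whichever 
thread calls epoll_wait() next.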

I'm referring specifically to edge-triggered behavior, btw. I find 
edge-triggered development far easier and saner in a multi-threaded 
environment, and doing level-triggered and multi-threaded at the same time 
certainly seems like the wrong thing to do.
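
For readers less familiar with edge-triggered mode, here is a hedged 
sketch of the usual pattern: register the descriptor with EPOLLET, keep it 
non-blocking, and read until EAGAIN after each notification. The helper 
names add_edge_triggered() and drain_fd() are hypothetical, and this is not 
taken from the code discussed in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Make fd non-blocking and register it for edge-triggered readability.
 * Returns 0 on success, -1 on error. */
static int add_edge_triggered(int epfd, int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Read until the descriptor would block (or hits EOF).
 * With EPOLLET no further event arrives until new data shows up,
 * so the handler has to consume everything that is available now.
 * Returns the total number of bytes read, or -1 on a real error. */
static long drain_fd(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;                 /* EOF */
        if (errno == EINTR)
            continue;                     /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;                 /* fully drained */
        return -1;
    }
}
```

Because the event is reported once per edge, only the thread that received 
it touches the descriptor until more data arrives, which is part of why 
this mode tends to pair well with several threads sharing one epoll fd.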

In any case, I see little point in a thread whose job is simply to move 
something from queue A (epoll ready list) to queue B (thread work list). 
My latest code basically uses epoll_wait() as a load balancing mechanism 
to pass out work. The quantums are fairly small. There may be situations 
where you get a burst of traffic that one thread handles while others are 
momentarily idle, but handling that traffic is a very quick operation (and 
everything is non-blocking). You really only need the other threads to 
participate when the load starts to get to the point where the 
epoll_wait() calls will be constantly returning anyway.
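
Roughly, the arrangement described above could look like the sketch below: 
several threads block in epoll_wait() on one shared epoll fd, each taking 
a small batch, with no intermediate dispatcher queue. This is an 
illustration under assumptions, not the actual code mentioned here; 
QUANTUM, struct pool, worker(), run_pool() and add_ready_eventfd() are all 
hypothetical names, and the `expected` counter exists only so the demo can 
terminate.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define QUANTUM     2      /* hypothetical small batch per epoll_wait() */
#define MAX_WORKERS 16

struct pool {
    int epfd;              /* epoll instance shared by every worker */
    int expected;          /* demo only: stop once this many are handled */
    atomic_int handled;    /* events processed so far, across all workers */
};

/* Each worker dequeues straight from the epoll ready list. */
static void *worker(void *arg)
{
    struct pool *p = arg;
    struct epoll_event events[QUANTUM];

    while (atomic_load(&p->handled) < p->expected) {
        /* Short timeout so the loop re-checks the exit condition. */
        int n = epoll_wait(p->epfd, events, QUANTUM, 50);
        for (int i = 0; i < n; i++) {
            uint64_t value;
            /* "Handle" the event: consume the eventfd counter. The read is
             * non-blocking, so a spurious wakeup is simply not counted. */
            if (read(events[i].data.fd, &value, sizeof value)
                    == (ssize_t)sizeof value)
                atomic_fetch_add(&p->handled, 1);
        }
    }
    return NULL;
}

/* Register one already-readable, edge-triggered eventfd. 0 on success. */
static int add_ready_eventfd(int epfd)
{
    int fd = eventfd(1, EFD_NONBLOCK);
    if (fd < 0)
        return -1;
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Start nthreads workers and wait for them to finish.
 * Returns the number of events handled, or -1 on error. */
static int run_pool(struct pool *p, int nthreads)
{
    pthread_t tids[MAX_WORKERS];

    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[i], NULL, worker, p) != 0)
            return -1;
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&p->handled);
}
```

Whichever thread happens to be waiting picks up the next batch, so the 
epoll ready list itself plays the role of the work queue. Build with 
-pthread where the C library requires it.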

> Davi's patch will be re-factored against 22-rc1 and submitted in any case
> though.

Great. I'm just glad I saw this mail; I probably would have burned quite 
a bit of time in the coming weeks trying to figure out why my epoll code 
wasn't running smoothly.

>
> - Davide
>

Thanks,
Chase