Message-ID: <Pine.LNX.4.64.0705071259580.1227@turbotaz.ourhouse>
Date: Mon, 7 May 2007 13:17:49 -0500 (CDT)
From: Chase Venters <chase.venters@...entec.com>
To: Davide Libenzi <davidel@...ilserver.org>
cc: Chase Venters <chase.venters@...entec.com>,
Davi Arnaut <davi@...ent.com.br>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] rfc: threaded epoll_wait thundering herd
On Mon, 7 May 2007, Davide Libenzi wrote:
> On Mon, 7 May 2007, Chase Venters wrote:
>
>> I'm working on event handling code for multiple projects right now, and my
>> method of calling epoll_wait() is to do so from several threads. I've glanced
>> at the epoll code but obviously haven't noticed the wake-all behavior... good
>> to know. I suppose I'm going to have to hack around this problem by wrapping
>> epoll_wait() calls in a mutex. That sucks - it means other threads won't be
>> able to 'get ahead' by preparing their wait before it is their turn to dequeue
>> events.
>>
>> In any case, I think having multiple threads blocking on epoll_wait() is a
>> much saner idea than one thread which then passes out events, so I must voice
>> my support for fixing this case. Why this is the exception instead of the norm
>> is a little baffling, but I've seen so many perverse things in multi-threaded
>> code...
>
> The problem that you can have with multiple threads calling epoll_wait()
> on an SMP system, is that if you sweep 100 events in one thread, and this
> thread goes alone in processing those, you may have other CPUs idle while
> the other thread is handling those. Either you call epoll_wait() from
> multiple threads by keeping the event buffer passed to epoll_wait() fairly
> limited, or you use a single epoll_wait() fetcher with a queue (or queues)
> from which worker threads pull.

Working with smaller quantums is indeed the right thing to do.

In any case, let's consider why you're getting 100 events from one
epoll_wait():

1. You have a single thread doing the dequeue, and it is taking a long
time (perhaps due to the time it is taking to requeue the work in other
threads).
2. Your load is so high that you are taking lots and lots of events, in
which case the other epoll_wait() threads are going to be woken up very
soon with work anyway. In this scenario you will be "scheduling" work at
"odd" times based on its arrival, but that's just another argument to use
smaller quantums.
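
A minimal sketch of the "smaller quantums" idea, assuming Linux and the standard epoll(7) calls. The names NPIPES, BATCH and dequeue_one_batch() are hypothetical and exist only for illustration. Many sources are made ready at once, but a single epoll_wait() call hands back at most BATCH of them because the event buffer is that small; the rest stay on the ready list for the next caller.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 16   /* hypothetical: number of sources made ready */
#define BATCH   4   /* hypothetical: the small "quantum" per dequeue */

/* Make NPIPES pipes readable, then dequeue once with a BATCH-sized
 * buffer. Returns the number of events one epoll_wait() call handed
 * back, or -1 on a setup error. */
int dequeue_one_batch(void)
{
    int fds[NPIPES][2];
    struct epoll_event batch[BATCH];
    int created = 0, ok = 1, n = -1;

    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (created = 0; ok && created < NPIPES; created++) {
        if (pipe(fds[created]) != 0) {
            ok = 0;
            break;              /* this pipe was never created */
        }
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = fds[created][0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[created][0], &ev) != 0 ||
            write(fds[created][1], "x", 1) != 1)
            ok = 0;             /* loop ends; pipe is closed below */
    }

    /* All NPIPES sources are ready, but the buffer caps the batch. */
    if (ok)
        n = epoll_wait(epfd, batch, BATCH, 0);

    for (int i = 0; i < created; i++) {
        close(fds[i][0]);
        close(fds[i][1]);
    }
    close(epfd);
    return n;
}
```

The remaining ready sources are not lost; another thread blocked in epoll_wait() on the same descriptor can pick them up.
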
I'm referring specifically to edge-triggered behavior, btw. I find
edge-triggered development far easier and saner in a multi-threaded
environment, and doing level-triggered and multi-threaded at the same time
certainly seems like the wrong thing to do.
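
To illustrate the edge-triggered behavior referred to here, a small sketch assuming Linux epoll with EPOLLET on a non-blocking pipe. drain_fd() and edge_demo() are hypothetical helper names, not part of any code discussed in this thread. The point is that a readiness edge is reported once, so whichever thread receives it is expected to read until EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Read until the kernel reports EAGAIN (or EOF).
 * Returns the number of bytes consumed, or -1 on a real error. */
long drain_fd(int fd)
{
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t r = read(fd, buf, sizeof buf);
        if (r > 0) {
            total += r;
            continue;
        }
        if (r == 0)
            return total;                       /* EOF */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;                       /* fully drained */
        return -1;
    }
}

/* Writes 5 bytes to a pipe registered with EPOLLET, polls twice
 * without reading, then drains. Stores the two epoll_wait() results
 * in *first and *second and returns the bytes drained (-1 on error). */
long edge_demo(int *first, int *second)
{
    int p[2];
    long got = -1;
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event out;

    if (pipe(p) != 0)
        return -1;

    int epfd = epoll_create1(0);
    ev.data.fd = p[0];
    if (epfd >= 0 &&
        fcntl(p[0], F_SETFL, O_NONBLOCK) == 0 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "hello", 5) == 5) {
        *first = epoll_wait(epfd, &out, 1, 0);   /* the edge is reported */
        *second = epoll_wait(epfd, &out, 1, 0);  /* no new edge since then */
        got = drain_fd(p[0]);
    }

    if (epfd >= 0)
        close(epfd);
    close(p[0]);
    close(p[1]);
    return got;
}
```

With level-triggered registration the second poll would report the descriptor again while data is still unread, which is how several threads can end up holding the same event.
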
In any case, I see little point in a thread whose job is simply to move
something from queue A (epoll ready list) to queue B (thread work list).

My latest code basically uses epoll_wait() as a load balancing mechanism
to pass out work. The quantums are fairly small. There may be situations
where you get a burst of traffic that one thread handles while others are
momentarily idle, but handling that traffic is a very quick operation (and
everything is non-blocking). You really only need the other threads to
participate when the load starts to get to the point where the
epoll_wait() calls will be constantly returning anyway.
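
A rough sketch of using a shared epoll descriptor as the work distributor, under stated assumptions: Linux, POSIX threads, C11 atomics, edge-triggered non-blocking pipes standing in for real connections. NWORKERS, NSOURCES, BATCH, worker() and run_workers() are all hypothetical names, and this is not the code described above, only one way the pattern could look. Each worker blocks in epoll_wait() with a small buffer and handles what it gets; there is no separate dispatcher thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NWORKERS  4   /* hypothetical thread count */
#define NSOURCES 32   /* hypothetical number of event sources */
#define BATCH     2   /* small quantum per epoll_wait() call */

static int epfd;
static atomic_int handled;

/* Each worker dequeues a small batch straight from the epoll ready
 * list and processes it without blocking. */
static void *worker(void *arg)
{
    struct epoll_event batch[BATCH];
    (void)arg;

    for (;;) {
        int n = epoll_wait(epfd, batch, BATCH, 100);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            break;      /* demo only: exit once idle, or on error */
        for (int i = 0; i < n; i++) {
            char c;
            /* Edge-triggered: read until the fd is empty (EAGAIN). */
            while (read(batch[i].data.fd, &c, 1) == 1)
                atomic_fetch_add(&handled, 1);
        }
    }
    return NULL;
}

/* Queues one byte on each source, runs the workers, and returns the
 * total number of bytes they handled (-1 on setup failure). */
int run_workers(void)
{
    int fds[NSOURCES][2];
    pthread_t tid[NWORKERS];
    int nfds = 0, nthreads = 0, ok = 1;

    atomic_store(&handled, 0);
    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (nfds = 0; ok && nfds < NSOURCES; nfds++) {
        if (pipe(fds[nfds]) != 0) {
            ok = 0;
            break;
        }
        struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
        ev.data.fd = fds[nfds][0];
        if (fcntl(fds[nfds][0], F_SETFL, O_NONBLOCK) != 0 ||
            epoll_ctl(epfd, EPOLL_CTL_ADD, fds[nfds][0], &ev) != 0 ||
            write(fds[nfds][1], "x", 1) != 1)
            ok = 0;
    }

    for (nthreads = 0; ok && nthreads < NWORKERS; nthreads++)
        if (pthread_create(&tid[nthreads], NULL, worker, NULL) != 0)
            break;
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);

    for (int i = 0; i < nfds; i++) {
        close(fds[i][0]);
        close(fds[i][1]);
    }
    close(epfd);
    return ok ? atomic_load(&handled) : -1;
}
```

Depending on the toolchain this may need to be built with -pthread. Which worker ends up with which event is not deterministic; only the total is.
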
> Davi's patch will be re-factored against 22-rc1 and submitted in any case
> though.
Great. I'm just glad I saw this mail -- I probably would have burned quite
some time in the coming weeks trying to figure out why my epoll code
wasn't running quite smoothly.
>
> - Davide
>
Thanks,
Chase