Message-ID: <m2l412e6f7f1004280849gf8ab11a0l6f25542014a71c38@mail.gmail.com>
Date: Wed, 28 Apr 2010 23:49:19 +0800
From: Changli Gao <xiaosuo@...il.com>
To: Jamie Lokier <jamie@...reable.org>
Cc: David Howells <dhowells@...hat.com>,
Yong Zhang <yong.zhang@...driver.com>,
Xiaotian Feng <xtfeng@...il.com>, Ingo Molnar <mingo@...e.hu>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Davide Libenzi <davidel@...ilserver.org>,
Roland Dreier <rolandd@...co.com>,
Stefan Richter <stefanr@...6.in-berlin.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <dada1@...mosbay.com>,
Christoph Lameter <cl@...ux.com>,
Andreas Herrmann <andreas.herrmann3@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Takashi Iwai <tiwai@...e.de>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] sched: implement the exclusive wait queue as a LIFO queue
On Wed, Apr 28, 2010 at 11:25 PM, Jamie Lokier <jamie@...reable.org> wrote:
> Changli Gao wrote:
>> On Wed, Apr 28, 2010 at 9:21 PM, Jamie Lokier <jamie@...reable.org> wrote:
>> > Changli Gao wrote:
>> >>
>> >> fs/eventpoll.c: 1443.
>> >> wait.flags |= WQ_FLAG_EXCLUSIVE;
>> >> __add_wait_queue(&ep->wq, &wait);
>> >
>> > The same thing about assumptions applies here. The userspace process
>> > may be waiting for an epoll condition to get access to a resource,
>> > rather than being a worker thread interchangeable with others.
>>
>> Oh, the lines above are the current ones. So the assumption applies
>> and works here.
>
> No, because WQ_FLAG_EXCLUSIVE doesn't have your LIFO semantic at the moment.
>
> Your patch changes the behaviour of epoll, though I don't know if it
> matters. Perhaps all programs which have multiple tasks waiting on
> the same epoll fd are "interchangeable worker thread" types anyway :-)
>
No, you are wrong. I meant that epoll already implements LIFO on its own.
You should check the code. :)
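
As a rough illustration of the point about epoll: an entry queued at the head of a wait list is the first one a wake-up walk reaches, so head insertion behaves as LIFO and tail insertion as FIFO. The sketch below is a userspace model in C, not kernel code. `struct waiter`, `wl_add_head`, `wl_add_tail` and `wl_wake_one` are hypothetical names, and dequeuing the entry on wake-up is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a wait entry; not the kernel's type. */
struct waiter {
    int id;
    struct waiter *prev, *next;
};

struct wait_list {
    struct waiter head;   /* sentinel: head.next is the first entry woken */
};

static void wl_init(struct wait_list *wl)
{
    wl->head.id = -1;
    wl->head.prev = wl->head.next = &wl->head;
}

/* Link w between two adjacent entries. */
static void wl_link(struct waiter *w, struct waiter *prev, struct waiter *next)
{
    assert(w != NULL);
    w->prev = prev;
    w->next = next;
    prev->next = w;
    next->prev = w;
}

/* Head insertion: the newest waiter is woken first (LIFO). */
static void wl_add_head(struct wait_list *wl, struct waiter *w)
{
    wl_link(w, &wl->head, wl->head.next);
}

/* Tail insertion: the oldest waiter is woken first (FIFO). */
static void wl_add_tail(struct wait_list *wl, struct waiter *w)
{
    wl_link(w, wl->head.prev, &wl->head);
}

/* Wake exactly one waiter, the one nearest the head.
 * Returns its id, or -1 if the list is empty. */
static int wl_wake_one(struct wait_list *wl)
{
    struct waiter *w = wl->head.next;

    if (w == &wl->head)
        return -1;
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = NULL;
    return w->id;
}
```

Queuing waiters 0, 1, 2 with `wl_add_head()` and waking three times yields 2, 1, 0; with `wl_add_tail()` the order is 0, 1, 2.
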
>> > For example, userspace might be using a pipe as a signal-safe lock, or
>> > signal-safe multi-token semaphore, and epoll to wait for that pipe.
>> >
>> > WQ_FLAG_EXCLUSIVE means there is no point waking all tasks, to avoid a
>> > pointless thundering herd. It doesn't mean unfairness is ok.
>>
>> The users should not make any assumption about the waking up sequence,
>> neither LIFO nor FIFO.
>
> Correct, but they should be able to assume non-starvation (eventual
> progress) for all waiters.
>
> It's one of those subtle things, possibly a unixy thing: Non-RT tasks
> should always make progress when the competition is just other non-RT
> tasks, even if the progress is slow.
>
> Starvation can spread out beyond the starved process, to cause
> priority inversions in other tasks that are waiting on a resource
> locked by the starved process. Among other things, that can cause
> higher priority tasks, and RT priority tasks, to block permanently.
> Very unpleasant.
>
>> > The LIFO idea _might_ make sense for interchangeable worker-thread
>> > situations - including userspace. It would make sense for pipe
>> > waiters, socket waiters (especially accept), etc.
>>
>> Yea, and my following patches are for socket waiters.
>
> Occasionally unix socketpairs are used in the above ways too.
>
> I'm not against your patch, but I worry that starvation is a new
> semantic, and it may have a significant effect on something - either
> in the kernel, or in userspace which is harder to check.
Thanks for the reminder.
>
> I suspect it's possible to combine LIFO-ish and FIFO-ish queuing to
> prevent starvation while getting some of the locality benefit.
> Something like add-LIFO and increment a small counter in the next wait
> entry, but never add in front of an entry whose counter has reached
> MAX_LIFO_WAITERS? :-)
>
That is a little complex. I'll keep it simple for now and improve it when necessary.
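
For reference, one possible reading of the combined LIFO/FIFO suggestion quoted above, as a userspace C sketch rather than a kernel patch. The `bl_*` names and the `bypassed` field are hypothetical, and the value chosen for `MAX_LIFO_WAITERS` is arbitrary. A new waiter is queued as near the head as possible, but never in front of an entry whose counter has saturated; the entry it lands directly in front of has its counter incremented. Once an entry saturates, no later arrival is placed ahead of it, so the number of waiters ahead of it can only shrink.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIFO_WAITERS 2   /* name from the thread; the value is an assumption */

/* Hypothetical userspace model of a wait entry; not the kernel's type. */
struct bl_waiter {
    int id;
    unsigned int bypassed;   /* times a newer waiter was queued directly in front */
    struct bl_waiter *prev, *next;
};

struct bl_list {
    struct bl_waiter head;   /* sentinel: head.next is the first entry woken */
};

static void bl_init(struct bl_list *bl)
{
    bl->head.id = -1;
    bl->head.bypassed = 0;
    bl->head.prev = bl->head.next = &bl->head;
}

/* Queue w as close to the head as allowed without passing a saturated entry. */
static void bl_add(struct bl_list *bl, struct bl_waiter *w)
{
    struct bl_waiter *pos = bl->head.prev;

    assert(w != NULL);

    /* Walk back from the tail to the last saturated entry, if any. */
    while (pos != &bl->head && pos->bypassed < MAX_LIFO_WAITERS)
        pos = pos->prev;

    /* pos is the sentinel (plain LIFO) or the last saturated entry. */
    w->bypassed = 0;
    w->prev = pos;
    w->next = pos->next;
    pos->next->prev = w;
    pos->next = w;

    /* Charge the entry we just stepped in front of, unless we are at the tail. */
    if (w->next != &bl->head)
        w->next->bypassed++;
}

/* Wake (dequeue) the waiter nearest the head.
 * Returns its id, or -1 if the list is empty. */
static int bl_wake_one(struct bl_list *bl)
{
    struct bl_waiter *w = bl->head.next;

    if (w == &bl->head)
        return -1;
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = NULL;
    return w->id;
}
```

With `MAX_LIFO_WAITERS` set to 2 and one new arrival per wake-up, waiter 0 is passed by waiters 1 and 2 and is then woken ahead of waiter 3. The backward scan is linear in the queue length; a real implementation would probably track the insertion point instead.
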
--
Regards,
Changli Gao(xiaosuo@...il.com)