Message-Id: <20081120035951.EC81815423A@magilla.localdomain>
Date: Wed, 19 Nov 2008 19:59:51 -0800 (PST)
From: Roland McGrath <roland@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Ratan Nalumasu <rnalumasu@...il.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] do_wait wakeup optimization
> Patch looks sane, and look worth queueing up for the next merge window.
> But if somebody actually has numbers and/or can talk about the real-life
> load that made people even notice this, that would be good to add to the
> description.
Ratan came up with the idea. I just filled in some of the details to make
it work and clean it up. So I'll leave this explanation to him.
> Also, do we really need to call eligible_child() twice? The real wait only
> does it once in that "wait_consider_task()". Explanations would be good..
There are two separate, unrelated reasons for the second call: one for the
thread_group_leader case and one for the non-leader case.
In the thread_group_leader case, we might be doing the wakeup for a child
whose parent ignores SIGCHLD. Since it self-reaps, there will be nothing
left for do_wait() to find after it wakes up. But the wake-up is still
required. A parent that ignores SIGCHLD can do e.g.:
	while (wait (NULL) > 0);
and that will block while there are any live children, then quickly fail
with ECHILD when there are none left. So, we cannot short-circuit this
wake-up, even though when do_wait() wakes up and then calls eligible_child(),
it won't match due to ->exit_signal==-1 (aka task_detached()). (Note the
second eligible_child() call is only needed when task_detached(task),
i.e. its parent ignored SIGCHLD, not the common case.)
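
For anyone who wants to see that user-visible behavior, here is a minimal
user-space sketch. It is only an illustration, not part of the patch. The
helper name drain_children_ignoring_sigchld(), the child count and the sleep
times are made up for the example. It ignores SIGCHLD, forks a few
short-lived children, runs the loop above, and returns the errno left by the
final failing wait().

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: fork nchildren short-lived
 * children while SIGCHLD is ignored, then block in the wait() loop until
 * none are left.  Returns the errno set by the final failing wait().
 */
static int drain_children_ignoring_sigchld(int nchildren)
{
	int err;

	/* With SIGCHLD ignored, exiting children self-reap (no zombies). */
	signal(SIGCHLD, SIG_IGN);

	for (int i = 0; i < nchildren; i++) {
		pid_t pid = fork();

		assert(pid >= 0);
		if (pid == 0) {
			/* Child: stay alive briefly so the parent really blocks. */
			usleep(50 * 1000 * (i + 1));
			_exit(0);
		}
	}

	/* Blocks while any child is alive, then fails once none remain. */
	errno = 0;
	while (wait(NULL) > 0)
		;
	err = errno;

	/* Restore the default disposition before returning. */
	signal(SIGCHLD, SIG_DFL);
	return err;
}
```

Calling drain_children_ignoring_sigchld(3) should return ECHILD after the
last child has exited. The parent only gets there because each exiting child
still wakes it up, which is the wake-up that must not be skipped.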
In the non-leader case, we're dealing with the one situation where
do_notify_parent() can be called on a task other than current.
Unfortunately, in the wake_function we have no way to tell which task was
the argument to do_notify_parent(). We can only assume that it was
current, as it usually is. So we're short-circuiting if current is an
eligible child for the particular do_wait() call, not if the task passed to
do_notify_parent() is eligible.
This one case is in release_task(); the call is on current->group_leader.
So to avoid wrongly skipping the wake-up in this case, we do a second check
on the eligibility of the group_leader. We wouldn't need this if we knew
which task was the argument to the do_notify_parent() call doing the wake-up,
but I don't know how to communicate that down.
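
To make the non-leader logic concrete, here is a simplified user-space model.
It is a sketch of the idea only, not the code in the patch. The names
model_task, model_wait, model_eligible_child() and model_needs_wakeup() are
invented for the illustration, eligibility is reduced to a bare pid match,
and the detached-leader case described earlier is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's data structures. */
struct model_task {
	int pid;
	struct model_task *group_leader;	/* points to itself for a leader */
};

struct model_wait {
	int pid;	/* pid being waited for, or -1 for any child */
};

/* Assumption: eligibility is modeled as a plain pid match. */
static bool model_eligible_child(const struct model_wait *w,
				 const struct model_task *t)
{
	return w->pid == -1 || w->pid == t->pid;
}

/*
 * The wake function only knows "current", not the task that was passed
 * to do_notify_parent().  So check current first, and for a non-leader
 * also check its group leader, which is the task release_task() notifies
 * about.
 */
static bool model_needs_wakeup(const struct model_wait *w,
			       const struct model_task *current_task)
{
	assert(w != NULL && current_task != NULL);

	if (model_eligible_child(w, current_task))
		return true;
	if (current_task->group_leader != current_task)
		return model_eligible_child(w, current_task->group_leader);
	return false;
}
```

In this model, a waiter asking for the leader's pid is still woken when
current is a non-leader thread of that group, while a waiter asking for an
unrelated pid is skipped.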
I haven't thought of something simpler that wouldn't have false negatives
for needs_wakeup().
Thanks,
Roland