Message-ID: <47543EEC.7010900@nortel.com>
Date: Mon, 03 Dec 2007 11:37:48 -0600
From: "Chris Friesen" <cfriesen@...tel.com>
To: davids@...master.com
CC: Nick Piggin <nickpiggin@...oo.com.au>, Ingo Molnar <mingo@...e.hu>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
Arjan van de Ven <arjan@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: sched_yield: delete sysctl_sched_compat_yield

David Schwartz wrote:
> I've asked versions of this question at least three times and never gotten
> anything approaching a straight answer:
>
> 1) What is the current default 'sched_yield' behavior?
>
> 2) What is the current alternate 'sched_yield' behavior?
I'm pretty sure I've seen responses from Ingo describing this multiple
times in various threads. Google should have them.

If I remember right, the default is to simply recalculate the task's
position in the tree and reinsert it, and the alternate is to yield to
everything currently runnable.
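
To make the difference between the two behaviors concrete, here is a toy
model in C. It is only a sketch of the description above, not the kernel
code: the run queue is a small sorted array rather than a red-black tree,
the running task is assumed to be off the queue while it runs, and every
name (toy_rq, toy_yield_default, toy_yield_compat, ...) is made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a run queue kept as an array sorted by a time key.
 * All names here are hypothetical, not kernel identifiers. */
#define TOY_RQ_CAP 8

struct toy_task {
    int id;
    unsigned long long key;   /* smaller key = runs sooner */
};

struct toy_rq {
    struct toy_task tasks[TOY_RQ_CAP];
    size_t nr;
};

/* Insert a task, keeping the array sorted by key (ties go after equals). */
static void toy_enqueue(struct toy_rq *rq, struct toy_task t)
{
    size_t i = rq->nr;

    assert(rq->nr < TOY_RQ_CAP);
    while (i > 0 && rq->tasks[i - 1].key > t.key) {
        rq->tasks[i] = rq->tasks[i - 1];
        i--;
    }
    rq->tasks[i] = t;
    rq->nr++;
}

/* Remove and return the leftmost task (the one that would run next). */
static struct toy_task toy_dequeue_leftmost(struct toy_rq *rq)
{
    struct toy_task t;

    assert(rq->nr > 0);
    t = rq->tasks[0];
    for (size_t i = 1; i < rq->nr; i++)
        rq->tasks[i - 1] = rq->tasks[i];
    rq->nr--;
    return t;
}

/* "Default" yield as described above: charge the time actually used,
 * then reinsert wherever the updated key happens to land. */
static void toy_yield_default(struct toy_rq *rq, struct toy_task cur,
                              unsigned long long ran_for)
{
    cur.key += ran_for;
    toy_enqueue(rq, cur);
}

/* "Alternate" yield as described above: move behind every runnable task. */
static void toy_yield_compat(struct toy_rq *rq, struct toy_task cur)
{
    if (rq->nr > 0)
        cur.key = rq->tasks[rq->nr - 1].key + 1;
    toy_enqueue(rq, cur);
}
```

In this model a task that has run only briefly can land right back at the
front of the queue after toy_yield_default(), so the yield may have almost
no effect, while toy_yield_compat() always sends it to the back.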
> 3) Are either of them sensible? Simply acting as if the current thread's
> timeslice was up should be sufficient.
The new scheduler doesn't really have a concept of "timeslice". This is
one of the core problems with determining what to do on sched_yield().
> The implication I keep getting is that neither the default behavior nor the
> alternate behavior are sensible. What is so hard about simply scheduling the
> next thread?
The problem is where to reinsert the task that is yielding. CFS is
based around a tree structure ordered by time.

The old scheduler was priority-based, so a task could essentially yield
to every other task at the same nice level.

With the new scheduler this would still be possible, but it would involve
extra work to track the position of the rightmost task at each priority
level. That additional overhead is what Ingo is trying to avoid.
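
As a rough illustration of that bookkeeping, the sketch below keeps a
"rightmost key" per nice level next to a toy queue. It is an assumption
about what such tracking could look like, not something taken from the
kernel: the queue is a flat unsorted array, the running task is assumed to
be off the queue, and all names are hypothetical. The point is only that
every enqueue and dequeue has to pay for maintaining the extra state.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NICE_LEVELS 40          /* nice values -20..19 */
#define TOY_MAX_TASKS   16

struct toy_entity {
    int nice;                       /* -20..19 */
    unsigned long long key;         /* position in the time ordering */
};

struct toy_queue {
    struct toy_entity *ents[TOY_MAX_TASKS];
    size_t nr;
    /* Extra state: largest queued key per nice level (valid if has_max). */
    unsigned long long max_key[TOY_NICE_LEVELS];
    int has_max[TOY_NICE_LEVELS];
};

static int lvl(int nice)
{
    return nice + 20;
}

/* Every enqueue pays for one extra compare-and-store. */
static void toy_track_enqueue(struct toy_queue *q, struct toy_entity *e)
{
    int l = lvl(e->nice);

    assert(q->nr < TOY_MAX_TASKS);
    q->ents[q->nr++] = e;
    if (!q->has_max[l] || e->key > q->max_key[l]) {
        q->max_key[l] = e->key;
        q->has_max[l] = 1;
    }
}

/* Dequeue rescans the entity's level, since it may have been rightmost. */
static void toy_track_dequeue(struct toy_queue *q, struct toy_entity *e)
{
    int l = lvl(e->nice);
    size_t j = 0;

    q->has_max[l] = 0;
    for (size_t i = 0; i < q->nr; i++) {
        struct toy_entity *o = q->ents[i];

        if (o == e)
            continue;
        q->ents[j++] = o;
        if (lvl(o->nice) == l &&
            (!q->has_max[l] || o->key > q->max_key[l])) {
            q->max_key[l] = o->key;
            q->has_max[l] = 1;
        }
    }
    q->nr = j;
}

/* Yield only to peers: requeue just behind the rightmost same-nice task. */
static void toy_yield_same_nice(struct toy_queue *q, struct toy_entity *cur)
{
    int l = lvl(cur->nice);

    if (q->has_max[l])
        cur->key = q->max_key[l] + 1;
    toy_track_enqueue(q, cur);
}
```

A real implementation would need something cheaper than the rescan in
toy_track_dequeue(), which is exactly the kind of added cost being
discussed.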
> We don't need perfection, but it sounds like we have two alternatives of
> which neither is sensible.
sched_yield() isn't a great API. It just says to delay the task,
without specifying how long or what the task is waiting *for*. Other
constructs are much more useful because they give the scheduler more
information with which to make a decision.
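
For example, a thread that is waiting for another thread to produce a
value can block on a POSIX condition variable instead of spinning on
sched_yield(). The sketch below is just one illustration of such a
construct; the toy_mailbox type and the function names are invented for
the example, and error handling is kept minimal.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared between two threads. */
struct toy_mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;
    int             value;
};

/* Producer: publish a value and wake the waiter. */
static void *toy_producer(void *arg)
{
    struct toy_mailbox *mb = arg;

    pthread_mutex_lock(&mb->lock);
    mb->value = 42;
    mb->ready = true;
    pthread_cond_signal(&mb->ready_cv);
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Consumer: sleep until the value is ready. The kernel knows exactly
 * what this thread is waiting for, so it is not scheduled in between
 * (apart from spurious wakeups, which the loop handles). */
static int toy_wait_for_value(void)
{
    struct toy_mailbox mb;
    pthread_t tid;
    int value;

    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.ready_cv, NULL);
    mb.ready = false;
    mb.value = 0;

    if (pthread_create(&tid, NULL, toy_producer, &mb) != 0)
        return -1;

    pthread_mutex_lock(&mb.lock);
    while (!mb.ready)
        pthread_cond_wait(&mb.ready_cv, &mb.lock);
    value = mb.value;
    pthread_mutex_unlock(&mb.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&mb.ready_cv);
    pthread_mutex_destroy(&mb.lock);
    return value;
}
```

Compared with a loop of "while (!ready) sched_yield();", the waiter here
never has to be placed anywhere in the run queue until the producer
signals it, so the question of where to reinsert a yielding task does not
come up.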
Chris