[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070413225216.GA11384@elte.hu>
Date: Sat, 14 Apr 2007 00:52:16 +0200
From: Ingo Molnar <mingo@...e.hu>
To: William Lee Irwin III <wli@...omorphy.com>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Con Kolivas <kernel@...ivas.org>,
Nick Piggin <npiggin@...e.de>, Mike Galbraith <efault@....de>,
Arjan van de Ven <arjan@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
* William Lee Irwin III <wli@...omorphy.com> wrote:
> On Fri, Apr 13, 2007 at 10:21:00PM +0200, Ingo Molnar wrote:
> > [announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]
> > i'm pleased to announce the first release of the "Modular Scheduler Core
> > and Completely Fair Scheduler [CFS]" patchset:
> > http://redhat.com/~mingo/cfs-scheduler/sched-modular+cfs.patch
> > This project is a complete rewrite of the Linux task scheduler. My goal
> > is to address various feature requests and to fix deficiencies in the
> > vanilla scheduler that were suggested/found in the past few years, both
> > for desktop scheduling and for server scheduling workloads.
> > [ QuickStart: apply the patch to v2.6.21-rc6, recompile, reboot. The
> > new scheduler will be active by default and all tasks will default
> > to the new SCHED_FAIR interactive scheduling class. ]
>
> A pleasant surprise, though I did see it coming.
hey ;)
> On Fri, Apr 13, 2007 at 10:21:00PM +0200, Ingo Molnar wrote:
> > Highlights are:
> > - the introduction of Scheduling Classes: an extensible hierarchy of
> > scheduler modules. These modules encapsulate scheduling policy
> > details and are handled by the scheduler core without the core
> > code assuming about them too much.
>
> It probably needs further clarification that they're things on the
> order of SCHED_FIFO, SCHED_RR, SCHED_NORMAL, etc.; some prioritization
> amongst the classes is furthermore assumed, and so on. [...]
yep - they are linked via sched_ops->next pointer, with NULL delimiting
the last one.
> [...] They're not quite capable of being full-blown alternative
> policies, though quite a bit can be crammed into them.
yeah, they are not full-blown: i extended them on-demand, for the
specific purposes of sched_fair.c and sched_rt.c. More can be done too.
> There are issues with the per- scheduling class data not being very
> well-abstracted. [...]
yes. It's on my TODO list: i'll work more on extending the cleanups to
those fields too.
> A binomial heap would likely serve your purposes better than rbtrees.
> It's faster to have the next item to dequeue at the root of the tree
> structure rather than a leaf, for one. There are, of course, other
> priority queue structures (e.g. van Emde Boas) able to exploit the
> limited precision of the priority key for faster asymptotics, though
> actual performance is an open question.
i'm caching the leftmost leaf, which serves as an alternate, task-pick
centric root in essence.
> Another advantage of heaps is that they support decreasing priorities
> directly, so that instead of removal and reinsertion, a less invasive
> movement within the tree is possible. This nets additional constant
> factor improvements beyond those for the next item to dequeue for the
> case where a task remains runnable, but is preempted and its priority
> decreased while it remains runnable.
yeah. (Note that in CFS i'm not decreasing priorities anywhere though -
all the priority levels in CFS stay constant, fairness is not achieved
via rotating priorities or similar, it is achieved via the accounting
code.)
> On Fri, Apr 13, 2007 at 10:21:00PM +0200, Ingo Molnar wrote:
> > due to its design, the CFS scheduler is not prone to any of the
> > 'attacks' that exist today against the heuristics of the stock
> > scheduler: fiftyp.c, thud.c, chew.c, ring-test.c, massive_intr.c all
> > work fine and do not impact interactivity and produce the expected
> > behavior.
>
> I'm always suspicious of these claims. [...]
hey, sure - but please give it a go nevertheless, i _did_ test all these
;)
> A moderately formal regression test suite needs to be assembled [...]
by all means feel free! ;)
> A more general question here is what you mean by "completely fair;"
by that i mean the most common-sense definition: with N tasks running
each gets 1/N CPU time if observed for a reasonable amount of time. Now
extend this to arbitrary scheduling patterns, the end result should
still be completely fair, according to the fundamental 1/N(time) rule
individually applied to all the small scheduling patterns that the
scheduling patterns give. (this assumes that the scheduling patterns are
reasonably independent of each other - if they are not then there's no
reasonable definition of fairness that makes sense, and we might as well
use the 1/N rule for those cases too.)
> there doesn't appear to be inter-tgrp, inter-pgrp, inter-session, or
> inter-user fairness going on, though one might argue those are
> relatively obscure notions of fairness. [...]
sure, i mainly concentrated on what we have in Linux today. The things
you mention are add-ons that i can see handling via new scheduling
classes: all the CKRM and containers type of CPU time management
facilities.
> What these things mean when there are multiple CPU's to schedule
> across may also be of concern.
that is handled by the existing smp-nice load balancer, that logic is
preserved under CFS.
> These testcases are oblivious to SMP. This will demand that a
> scheduling policy integrate with load balancing to the extent that
> load balancing occurs for the sake of distributing CPU bandwidth
> according to nice level. Some explicit decision should be made
> regarding that.
this should already work reasonably fine with CFS: try massive_intr.c on
an SMP box.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists