[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070420064600.GA24614@elte.hu>
Date: Fri, 20 Apr 2007 08:46:00 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Peter Williams <pwil3058@...pond.net.au>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Con Kolivas <kernel@...ivas.org>,
Nick Piggin <npiggin@...e.de>, Mike Galbraith <efault@....de>,
Arjan van de Ven <arjan@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, caglar@...dus.org.tr,
Willy Tarreau <w@....eu>, Gene Heskett <gene.heskett@...il.com>
Subject: Re: [patch] CFS scheduler, v3
* Peter Williams <pwil3058@...pond.net.au> wrote:
> > - bugfix: use constant offset factor for nice levels instead of
> > sched_granularity_ns. Thus nice levels work even if someone sets
> > sched_granularity_ns to 0. NOTE: nice support is still naive, i'll
> > address the many nice level related suggestions in -v4.
>
> I have a suggestion I'd like to make that addresses both nice and
> fairness at the same time. As I understand the basic principle behind
> this scheduler it to work out a time by which a task should make it
> onto the CPU and then place it into an ordered list (based on this
> value) of tasks waiting for the CPU. I think that this is a great idea
> [...]
yes, that's exactly the main idea behind CFS, and thanks for the
compliment :)
Under this concept the scheduler never really has to guess: every
scheduler decision derives straight from the relatively simple
one-sentence (!) scheduling concept outlined above. Everything that
tasks 'get' is something they 'earned' before and all the scheduler does
are micro-decisions based on math with the nanosec-granularity values.
Both the rbtree and nanosec accounting are a straight consequence of
this too: they are the tools that allow the implementation of this
concept in the highest-quality way. It's certainly a very exciting
experiment to me and the feedback 'from the field' is very promising so
far.
> [...] and my suggestion is with regard to a method for working out
> this time that takes into account both fairness and nice.
>
> First suppose we have the following metrics available in addition to
> what's already provided.
>
> rq->avg_weight_load /* a running average of the weighted load on the
> CPU */ p->avg_cpu_per_cycle /* the average time in nsecs that p spends
> on the CPU each scheduling cycle */
yes. rq->nr_running is really just a first-level approximation of
rq->raw_weighted_load. I concentrated on the 'nice 0' case initially.
> I appreciate that the notion of basing the expected wait on the task's
> average cpu use per scheduling cycle is counter intuitive but I
> believe that (if you think about it) you'll see that it actually makes
> sense.
hm. So far i tried to not do any statistical approach anywhere: the
p->wait_runtime metric (which drives the task ordering) is in essence an
absolutely precise 'integral' of the 'expected runtimes' that the task
observes and hence is a precise "load-average as observed by the task"
in itself. Every time we base some metric on an average value we
introduce noise into the system.
i definitely agree with your suggestion that CFS should use a
nice-scaled metric for 'load' instead of the current rq->nr_running, but
regarding the basic calculations i'd rather lean towards using
rq->raw_weighted_load. Hm?
your suggestion concentrates on the following scenario: if a task
happens to schedule in an 'unlucky' way and happens to hit a busy period
while there are many idle periods. Unless i misunderstood your
suggestion, that is the main intention behind it, correct?
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists