Message-ID: <4648D381.6040808@cs.umass.edu>
Date: Mon, 14 May 2007 17:24:17 -0400
From: Ting Yang <tingy@...umass.edu>
To: Ingo Molnar <mingo@...e.hu>
CC: William Lee Irwin III <wli@...omorphy.com>,
Srivatsa Vaddagiri <vatsa@...ibm.com>, efault@....de,
linux-kernel@...r.kernel.org
Subject: Re: fair clock use in CFS
Ingo Molnar wrote:
> * William Lee Irwin III <wli@...omorphy.com> wrote:
>
>
>> On Mon, May 14, 2007 at 12:31:20PM +0200, Ingo Molnar wrote:
>>
>>> please clarify - exactly what is a mistake? Thanks,
>>>
>> The variability in ->fair_clock advancement rate was the mistake, at
>> least according to my way of thinking. [...]
>>
>
> you are quite wrong. Lets consider the following example:
>
> we have 10 tasks running (all at nice 0). The current task spends 20
> msecs on the CPU and a new task is picked. How much CPU time did that
> waiting task get entitled to during its 20 msecs wait? If fair_clock was
> constant as you suggest then we'd give it 20 msecs - but its true 'fair
> expectation' of CPU time was only 20/10 == 2 msecs!
>
Exactly. Once the queue's virtual time is used, all other measures should
be scaled onto virtual time for consistency, since at different times a
unit of virtual time maps to a different amount of wall-clock time.
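To make the scaling concrete, here is a minimal sketch of the arithmetic in Ingo's example above (10 tasks at nice 0, 20 msecs of wall-clock time). The function name fair_clock_delta() and the nanosecond units are assumptions for illustration only, not the actual CFS code, and weights other than the equal-weight case are ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not CFS code: how far the queue's virtual
 * ("fair") clock advances while delta_exec_ns of wall-clock time
 * passes with nr_running equal-weight tasks on the queue.
 */
uint64_t fair_clock_delta(uint64_t delta_exec_ns, unsigned int nr_running)
{
	assert(nr_running > 0);
	/* Each runnable task is entitled to 1/nr_running of the CPU. */
	return delta_exec_ns / nr_running;
}
```

With 10 runnable tasks, 20 msecs of wall-clock time maps to 2 msecs of fair-clock time; with a single runnable task the two clocks advance at the same rate.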
In CFS, p->wait_runtime is in fact the lag against the ideal system,
that is, the difference between the amount of work "really" done so far and
the amount of work that "should" have been done so far. CFS effectively keeps
a record for each task that indicates how far it is from the "should" case.
If a task has p->wait_runtime = 0, it has received exactly the share it is
entitled to so far. Similarly, negative means it has run faster than it
"should", and positive means slower than it "should".
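The sign convention can be written down as a small sketch. The struct and field names here (task_acct, ideal_ns, actual_ns) are hypothetical bookkeeping for illustration; CFS itself keeps only the running wait_runtime value rather than these two totals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; names are illustrative only. */
struct task_acct {
	int64_t ideal_ns;	/* work the task "should" have received */
	int64_t actual_ns;	/* work the task has "really" received */
};

/*
 * Lag against the ideal system, using the sign convention described
 * for wait_runtime: positive when the task is behind its fair share,
 * negative when it is ahead, zero when it got exactly its share.
 */
int64_t lag_ns(const struct task_acct *t)
{
	assert(t);
	return t->ideal_ns - t->actual_ns;
}
```

A task that waited while it was entitled to 2 msecs has a lag of +2 msecs; a task that ran 18 msecs more than its share has a lag of -18 msecs.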
I guess CFS may provide better "fairness" if it controls how much negative
wait_runtime a task can accumulate. Higher priority allows more negative
wait_runtime to be accumulated and lower priority allows less. CFS already
does this by scaling the preemption granularity based on weight; the only
issue is that the amount of negative wait_runtime that can be accumulated is
proportional to weight, which can potentially be O(n).
Is it possible to do something like this in check_preemption?
    delta = curr->fair_key - first->fair_key;
    if (delta > ??? [scale it as you wish] ||
        ((curr->fair_key > first->fair_key) &&
         (curr->wait_runtime > ??? [simple function of curr->weight])))
            preempt
A limit control on wait_runtime may prevent a high weight task from
running for too long, so that others get executed a little earlier. Just
a thought :-)
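A compilable sketch of the check proposed above, under stated assumptions. The two "???" thresholds are left as caller-supplied parameters (granularity and neg_limit, both hypothetical names) rather than given values. The second condition is read here as "curr has accumulated more negative wait_runtime than its limit allows", which is one interpretation of the pseudocode and not something the kernel implements. The struct is a stand-in, not the kernel's task_struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two fields used by the check; not task_struct. */
struct task_sketch {
	int64_t fair_key;
	int64_t wait_runtime;
};

/*
 * Hypothetical preemption test: preempt when curr is more than
 * `granularity` ahead of the leftmost task, or when it is ahead at all
 * and its negative wait_runtime exceeds `neg_limit` in magnitude.
 * Both thresholds are assumptions supplied by the caller.
 */
bool should_preempt(const struct task_sketch *curr,
		    const struct task_sketch *first,
		    int64_t granularity, int64_t neg_limit)
{
	assert(granularity >= 0 && neg_limit >= 0);

	int64_t delta = curr->fair_key - first->fair_key;

	if (delta > granularity)
		return true;
	/* Assumed reading: the limit bounds how negative wait_runtime may go. */
	return delta > 0 && -curr->wait_runtime > neg_limit;
}
```

The second branch is what would stop a high-weight task early: even inside its granularity window, it yields once it has run past the allowed amount of negative wait_runtime.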
Ting.