Message-ID: <18122.7841.346681.488204@cargo.ozlabs.ibm.com>
Date: Tue, 21 Aug 2007 09:07:13 +1000
From: Paul Mackerras <paulus@...ba.org>
To: Ingo Molnar <mingo@...e.hu>
Cc: Martin Schwidefsky <schwidefsky@...ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
Jan Glauber <jang@...ux.vnet.ibm.com>,
heiko.carstens@...ibm.com
Subject: Re: [accounting regression since rc1] scheduler updates
Ingo Molnar writes:
> We seem to agree wrt. sched_clock()'s behavior while the virtual CPU is
> busy: sched_clock() very much wants to track virtual time. (real time is
> pretty much meaningless and coupling sched_clock() to real time would
> make the virtual machine's behavior dependent on the host's load, which
> breaks the "seemless virtualization to inside observers" common-sense
> requirement of virtual-CPU scheduling.)
OK, that would imply that virtualized powerpc machines want to use the
PURR register for sched_clock(). The PURR basically counts time that
this virtual CPU (SMT thread) spends dispatching instructions, so it
excludes both the time taken by the hypervisor and the time (dispatch
cycles) taken by the other thread.
> For sched_clock()'s behavior while the virtual CPU is idle: my current
> idea for that is the patch below (a loosely analoguous problem exists
> with nohz/dynticks): it makes sched_clock() valid across idle periods
> too and uses wall-clock time for that.
The straightforward thing is just to use the PURR all the time, even
during idle. What that means is this:
* If the other thread is active then sched_clock() will continue to
advance at a slow rate during idle (reflecting the fact that the
active thread is getting almost all of the dispatch cycles).
* If the other thread is idle and the partition is a "shared
processor" partition then sched_clock() will not advance, since in that
case we idle in the hypervisor.
* If the other thread is idle and the partition is a "dedicated
processor" partition then sched_clock() will advance at half speed
on both threads, since the two threads each get half of the dispatch
cycles.
It sounds like this behaviour should be OK - do you agree?
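As a rough illustration only, the three idle cases above can be written as a toy model. This is not kernel code and does not read the PURR. The function name idle_purr_rate, the partition strings, and the idle_share value are hypothetical and chosen just for this sketch; in particular the 5% figure for the "slow rate" is an assumption, not a measured number.

```python
# Toy model of how a PURR-based sched_clock() would advance on an IDLE
# SMT thread, relative to wall-clock time. Hypothetical names throughout.

def idle_purr_rate(sibling_active: bool, partition: str,
                   idle_share: float = 0.05) -> float:
    """Return the fraction of wall-clock time by which sched_clock()
    advances while this thread is idle.

    idle_share is an assumed, illustrative fraction of dispatch cycles
    that an idle thread still receives while its sibling is busy.
    """
    if partition not in ("shared", "dedicated"):
        raise ValueError("partition must be 'shared' or 'dedicated'")
    if sibling_active:
        # The active sibling gets almost all dispatch cycles,
        # so the idle thread's clock creeps forward slowly.
        return idle_share
    if partition == "shared":
        # Both threads idle: the partition idles in the hypervisor,
        # so the clock does not advance at all.
        return 0.0
    # Both threads idle on a dedicated partition:
    # each thread gets half of the dispatch cycles.
    return 0.5


if __name__ == "__main__":
    for sibling_active in (True, False):
        for partition in ("shared", "dedicated"):
            rate = idle_purr_rate(sibling_active, partition)
            print(f"sibling_active={sibling_active!s:5} "
                  f"partition={partition:9} rate={rate}")
```

The point of the sketch is only that the idle-time rate is never tied to wall-clock time directly: it depends on what the sibling thread and the hypervisor are doing.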
> If a virtual CPU is idle then i think the "real = steal, virtual = 0"
> way of thinking about idle looks a bit unnatural to me - wouldnt it be
> better to think in terms of "steal = 0, virtual = real" ? Basically a
Stolen time while the virtual CPU is idle gets (or at least, used to
get :) accounted as idle time.
Paul.