Message-ID: <20070821093434.GB12025@elte.hu>
Date: Tue, 21 Aug 2007 11:34:34 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Martin Schwidefsky <schwidefsky@...ibm.com>
Cc: Christian Borntraeger <borntraeger@...ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
Jan Glauber <jang@...ux.vnet.ibm.com>,
heiko.carstens@...ibm.com, Paul Mackerras <paulus@...ba.org>
Subject: Re: [accounting regression since rc1] scheduler updates
* Martin Schwidefsky <schwidefsky@...ibm.com> wrote:
> > hm, does on s390 scheduler_tick() get driven in virtual time or in
> > real time? The very latest scheduler code will enforce a minimum
> > rate of sched_clock() across two scheduler_tick() calls (in rc3 and
> > later kernels). If sched_clock() "slows down" but scheduler_tick()
> > still has a real-time frequency then that impacts the quality of
> > scheduling. So scheduler_tick() and sched_clock() must really have
> > the same behavior (either both are virtual or both are real), so
> > that scheduling becomes invariant to steal-time.
>
> scheduler_tick() is based on the HZ timer which uses the TOD clock =
> real time. sched_clock() currently uses the TOD clock as well so in
> regard to the new scheduler we currently do not have a problem. We
> have a problem with cpu time accounting, the change to the /proc code
> breaks the precise accounting on s390. To solve the cpu time
> accounting we need to change sched_clock() to the cpu timer = virtual
> time. To change the scheduler_tick() as well requires another patch
> and I fear it would complicate things in the s390 backend.
my feeling is that it gives us generally higher-quality scheduling if we
drive all things scheduler via virtual time. Do you agree with that?
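
To illustrate why the tick and the clock should share a time base, here is a minimal sketch. It is a toy model, not kernel code: the struct and the function names are hypothetical, and the assumption is simply that a virtual clock stops advancing while the hypervisor has taken the CPU away.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one tick period; all values in nanoseconds. */
struct tick_model {
	uint64_t real_period_ns;	/* wall-clock time between two ticks */
	uint64_t steal_ns;		/* time stolen by the hypervisor in that period */
};

/*
 * Mixed case: the tick fires in real time, but the clock is virtual.
 * The clock advance seen between two ticks shrinks as steal time grows.
 */
static uint64_t delta_real_tick_virtual_clock(struct tick_model m)
{
	assert(m.steal_ns <= m.real_period_ns);
	return m.real_period_ns - m.steal_ns;
}

/* Both real: the clock advance per tick is the real period. */
static uint64_t delta_real_tick_real_clock(struct tick_model m)
{
	return m.real_period_ns;
}

/*
 * Both virtual: the tick fires after a fixed amount of virtual time,
 * so the clock advance per tick does not depend on steal time at all.
 */
static uint64_t delta_virtual_tick_virtual_clock(uint64_t virtual_period_ns)
{
	return virtual_period_ns;
}
```

In this model only the mixed case makes the per-tick clock delta depend on steal time; in the two consistent cases it is constant, which is the invariance meant above.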
> And if you say that the scheduling becomes invariant to steal-time,
> how is the cpu time accounting via sum_exec supposed to work if it
> does not take steal-time into account ?
right now there are two distinct and independent things: scheduler
behavior (the scheduling decisions the scheduler makes) and accounting
behavior.
the 'invariant' i mentioned only covers scheduler behavior, not
accounting behavior. Accounting is separate in theory, but coupled in
practice now via sum_exec_runtime.
Before we do a patch to decouple them again, let's make sure we agree on
the direction to take here. There are two ways to account within a
virtual machine: either in real time or in virtual time.
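
A rough sketch of the difference between the two, assuming a guest that can see how much of each interval was stolen by the hypervisor. The struct and both helpers are hypothetical names made up for illustration, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one interval during which a task was "running". */
struct run_interval {
	uint64_t wall_ns;	/* elapsed real (TOD) time */
	uint64_t steal_ns;	/* part of wall_ns during which the guest CPU was not backed */
};

/* Accounting in real time: charge the task for the full elapsed wall time. */
static uint64_t account_real(const struct run_interval *iv, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += iv[i].wall_ns;
	return sum;
}

/* Accounting in virtual time: charge only the time the CPU actually executed. */
static uint64_t account_virtual(const struct run_interval *iv, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		assert(iv[i].steal_ns <= iv[i].wall_ns);
		sum += iv[i].wall_ns - iv[i].steal_ns;
	}
	return sum;
}
```

With no steal time the two sums are identical; they only diverge when the hypervisor takes the CPU away, and the gap between them is exactly the total steal time.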
it seems you'd like accounting to be sensitive to 'external load' - i.e.
you'd like an 'internal' top to show the 'real' CPU accounting, right?
Wouldn't it be more consistent if a virtual box would not show any
dependency on external load? (i.e. it would slow down all of its
internal functionality transparently, without exposing it via /proc. The
only way to observe that would be the TOD interfaces: gettimeofday and
real-time clock driven POSIX timers. Even timer_list could be driven via
virtual time - although that would probably break user expectations,
right?) Or would accounting-in-virtual-time break user expectations too?
(most of the other hypervisors let guests account in virtual time.)
Ingo