Message-ID: <871vzhrvtv.fsf@szonett.ki.iif.hu>
Date: Thu, 18 Sep 2008 23:24:44 +0200
From: Ferenc Wagner <wferi@...f.hu>
To: Jeremy Fitzhardinge <jeremy@...p.org>
Cc: linux-kernel@...r.kernel.org,
Xen-devel <xen-devel@...ts.xensource.com>
Subject: Re: doubled idle count
Jeremy Fitzhardinge <jeremy@...p.org> writes:
> Ferenc Wagner wrote:
>
>> After upgrading a Xen virtual machine to Debian's 2.6.26-4 kernel, I
>> noticed that the idle counter doubled its pace on one of the machines:
>
> Sorry, could you be clearer here? What are we looking at?
Sure, see below:
>> service2:~# yes >/dev/null &
>> [1] 32578
Here I made the processor (a single virtual one) fully busy.
>> service2:~# grep cpu0 /proc/stat; sleep 1; grep cpu0 /proc/stat
>> cpu0 141208 9113 57273 13379659 61012 0 792 2350 0
>> cpu0 141310 9113 57274 13379659 61012 0 792 2350 0
Above, the difference of the first numbers is 102, which, given that
USER_HZ=100, means that the CPU spent all its cycles during the sleep
processing user code (the yes running in the background).
>> service2:~# fg
>> yes > /dev/null
>> ^C
Now the CPU is idle again, since I killed the yes started above.
>> service2:~# grep cpu0 /proc/stat; sleep 1; grep cpu0 /proc/stat
>> cpu0 141952 9113 57277 13383481 61012 0 792 2350 0
>> cpu0 141953 9113 57278 13383681 61012 0 792 2350 0
And here, the difference of the fourth numbers is 200, meaning that
the processor spent 200% of its time in idle state during this second!
(If I read the procfs documentation correctly, of course.)
This seems wrong by a factor of two, as there are only 100 "ticks" in
a second (actually, this kernel is tickless, but USER_HZ=100, as I'm
running a 686 kernel).
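To make the arithmetic explicit, here is a minimal sketch that computes the per-field tick differences from the two pairs of samples quoted in this message. The helper names (parse_cpu_line, tick_deltas) are made up for illustration, and the column names are an assumption based on the usual /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal, guest), not something taken from this kernel's source.

```python
# Column order assumed from the usual /proc/stat layout; the names are
# illustrative and not read from the kernel in question.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")

def parse_cpu_line(line):
    """Split a 'cpuN ...' line from /proc/stat into a name -> ticks dict."""
    parts = line.split()
    return dict(zip(FIELDS, (int(v) for v in parts[1:])))

def tick_deltas(before, after):
    """Return per-field tick differences between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    return {name: b[name] - a[name] for name in a if name in b}

# Samples taken one second apart while 'yes' was running.
busy = tick_deltas("cpu0 141208 9113 57273 13379659 61012 0 792 2350 0",
                   "cpu0 141310 9113 57274 13379659 61012 0 792 2350 0")

# Samples taken one second apart after 'yes' was killed.
idle = tick_deltas("cpu0 141952 9113 57277 13383481 61012 0 792 2350 0",
                   "cpu0 141953 9113 57278 13383681 61012 0 792 2350 0")

print(busy["user"], sum(busy.values()))   # 102 103
print(idle["idle"], sum(idle.values()))   # 200 202
```

With USER_HZ=100, a one-second interval on a single virtual CPU should account for roughly 100 ticks in total. The busy sample sums to 103, while the idle sample sums to 202, almost all of it in the idle column.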
>> One out of three machines show this effect, with the exact same kernel
>> and Xen versions (3.2.0, dom0 is Debian's stock Etch 2.6.18 kernel).
>> They aren't hosted by the same machine, though: the misbehaving one is
>> on a different installation with very similar hardware (3 vs 2 GHz).
>> All the guest are paravirtual.
>
> So you're saying that they are identical Xen and guest kernel binaries,
> but one of three is showing doubled idle time?
Yes.
> That seems unlikely.
I was very much surprised myself, too... The version numbers surely
are the same, but the binaries came from different downloads. I'll
compare them, and also start another domU next to the misbehaving one.
> The source of that time is from Xen itself, and I think it should be
> hardware independent, though I guess it's possible there's something
> going on in the Xen-level timekeeping.
>
> That said, I think there's some chance that stolen time may get counted
> as idle time. Does the one machine with a different outcome have
> something else running in another virtual machine (including dom0)?
Yes, both Xen instances run other domUs, and about one on each
consumes significant CPU. The other domUs are mostly idle, and so are
the dom0s.
--
Thanks for taking time,
Feri.