linux-kernel - Re: [git pull] scheduler fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1232271596.4824.29.camel@kulgan.wumi.org.au>
Date:	Sun, 18 Jan 2009 20:09:56 +1030
From:	Kevin Shanahan <kmshanah@...b.org.au>
To:	Avi Kivity <avi@...hat.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Mike Galbraith <efault@....de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	a.p.zijlstra@...llo.nl, torvalds@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [git pull] scheduler fixes

On Sun, 2009-01-18 at 10:59 +0200, Avi Kivity wrote:
> A simple test running 10 idle guests over 4 cores doesn't reproduce 
> (2.6.29-rc).  However we likely need a more complicated setup.
> 
> Kevin, any details about your setup?  What are the guests doing?  Are 
> you overcommitting memory (is the host swapping)?

Hi Avi,

No, I don't think there is any swapping. The host has 32GB RAM and only
~9GB of guest RAM has been requested in total. The host doesn't do
anything but run KVM.

I fear my test setup is quite complicated as it is a production system
and I haven't been able to reproduce a simpler case as yet. Anyway, here
are the guests I have running during the test:

Windows XP, 32-bit (UP) - There are twelve of these launched with
identical scripts, apart from differing amounts of RAM. Six with "-m
512", one with "-m 768" and five with "-m 256":

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -no-acpi -localtime -m 512 \
    -hda kvm-00-xp.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:56,model=rtl8139 \
    -net tap,vlan=0,ifname=tap0,script=no \
    -vnc 127.0.0.1:0 -usbdevice tablet \
    -daemonize

Windows 2003, 32-bit (UP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -no-acpi \
    -localtime -m 1024 \
    -hda kvm-09-2003-C.img \
    -hdb kvm-09-2003-E.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:5F,model=rtl8139 \
    -net tap,vlan=0,ifname=tap9,script=no \
    -vnc 127.0.0.1:9 -usbdevice tablet \
    -daemonize

Debian Lenny, 64-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 512 \
    -hda kvm-10-lenny64.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:60,model=rtl8139 \
    -net tap,vlan=0,ifname=tap10,script=no \
    -vnc 127.0.0.1:10 -usbdevice tablet \
    -daemonize

Debian Lenny, 32-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 512 \
    -hda kvm-15-etch-i386.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:65,model=rtl8139 \
    -net tap,vlan=0,ifname=tap15,script=no \
    -vnc 127.0.0.1:15 -usbdevice tablet \
    -daemonize

Debian Etch (using a kernel.org 2.6.27.10) 32-bit (MP):

KVM=/usr/local/kvm/bin/qemu-system-x86_64
$KVM \
    -smp 2 \
    -m 2048 \
    -hda kvm-17-1.img \
    -hdb kvm-17-tmp.img \
    -net nic,vlan=0,macaddr=52:54:00:12:34:67,model=rtl8139 \
    -net tap,vlan=0,ifname=tap17,script=no \
    -vnc 127.0.0.1:17 -usbdevice tablet \
    -daemonize

This last VM is the one I was pinging from the host.

I begin by launching the guests at 90 second intervals and then let them
settle for about 5 minutes. The final guest runs Apache with RT and a
postgresql database. I set my browser to refresh the main RT page every
2 minutes, to roughly simulate the common usage. With all the guests
basically idling and just the last guest servicing the RT requests, the
load average is about 0.25 or less. Then I just ping the last guest for
about 15 minutes.

One thing that may be of interest - a couple of times while running the
test I also had a terminal logged into the guest being pinged and
running "top". There was some crazy stuff showing up in there, with some
processes having e.g. 1200 in the "%CPU" column. I remember postmaster
(postgresql) being one of them. A quick scan over the top few tasks
might show the total "%CPU" adding up to 2000 or more. This tended to
happen at the same time I saw the latency spikes in ping as well. That
might just be an artifact of timing getting screwed up in "top", I
guess.

Regards,
Kevin.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/