lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1232474081.4895.76.camel@kulgan.wumi.org.au>
Date:	Wed, 21 Jan 2009 04:24:41 +1030
From:	Kevin Shanahan <kmshanah@...b.org.au>
To:	Avi Kivity <avi@...hat.com>
Cc:	Ingo Molnar <mingo@...e.hu>, "Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Mike Galbraith <efault@....de>,
	bugme-daemon@...zilla.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

On Tue, 2009-01-20 at 15:04 +0200, Avi Kivity wrote:
> Kevin Shanahan wrote:
> > On Tue, 2009-01-20 at 12:35 +0100, Ingo Molnar wrote:
> >> This only seems to occur under KVM, right? I.e. you tested it with -no-kvm 
> >> and the problem went away, correct?
> >>     
> >
> > Well, the I couldn't make the test conditions identical, but it the
> > problem didn't occur with the test I was able to do:
> >
> >   http://marc.info/?l=linux-kernel&m=123228728416498&w=2
> >
> >   
> 
> Can you also try with -no-kvm-irqchip?
> 
> You will need to comment out the lines
> 
>     /* ISA IRQs map to GSI 1-1 except for IRQ0 which maps
>      * to GSI 2.  GSI maps to ioapic 1-1.  This is not
>      * the cleanest way of doing it but it should work. */
> 
>     if (vector == 0)
>         vector = 2;
> 
> in qemu/hw/apic.c (should also fix -no-kvm smp).  This will change kvm 
> wakeups to use signals rather than the in-kernel code, which may be buggy.

Okay, I commented out those lines and compiled a new kvm-82 userspace
and kernel modules. Using those on a vanilla 2.6.28 kernel, with all
guests run with -no-kvm-irqchip added.

As before a number of the XP guests wanted to chug away at 100% CPU
usage for a long time. Three of the guests clocked up ~40 minutes CPU
time before I decided to just shut them down. Perhaps coincidentally,
these three guests are the only ones with Office 2003 installed on them.
That could be the difference between those guests and the other XP
guests, but that's probably not important for now.

The two Linux SMP guests booted okay this time, though they seem to only
use one CPU on the host (I guess kvm is not multi-threaded in this
mode?). "hermes-old", the guest I am pinging in all my tests, had a lot
of trouble running the apache2 setup - it was so slow it was difficult
to load a complete page from our RT system. The kvm process for this
guest was taking up 100% cpu on the host constantly and all sorts of
wierd stuff could be seen by watching top in the guest:

top - 03:44:17 up 43 min,  1 user,  load average: 3.95, 1.55, 0.80
Tasks: 101 total,   4 running,  97 sleeping,   0 stopped,   0 zombie
Cpu(s): 79.7%us, 10.4%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  9.9%si,
0.0%st
Mem:   2075428k total,   391128k used,  1684300k free,    13044k buffers
Swap:  3502160k total,        0k used,  3502160k free,   118488k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND             
 2956 postgres  20   0 19704  11m  10m S 1658  0.6   2:55.99 postmaster          
 2934 www-data  20   0 60392  40m 5132 R   31  2.0   0:17.28 apache2             
 2958 postgres  20   0 19700  11m 9.8m R   28  0.6   0:20.41 postmaster          
 2940 www-data  20   0 58652  38m 5016 S   27  1.9   0:04.87 apache2             
 2937 www-data  20   0 60124  40m 5112 S   18  2.0   0:11.00 apache2             
 2959 postgres  20   0 19132 5424 4132 S   10  0.3   0:01.50 postmaster          
 2072 postgres  20   0  8064 1416  548 S    7  0.1   0:23.71 postmaster          
 2960 postgres  20   0 19132 5368 4060 R    6  0.3   0:01.55 postmaster          
 2071 postgres  20   0  8560 1972  488 S    5  0.1   0:08.33 postmaster    

Running the ping test with without apache2 running in the guest:

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 902740ms
rtt min/avg/max/mdev = 0.568/3.745/272.558/16.990 ms

And with apache2 running:

--- hermes-old.wumi.org.au ping statistics ---
900 packets transmitted, 900 received, 0% packet loss, time 902758ms
rtt min/avg/max/mdev = 0.625/25.634/852.739/76.586 ms

In both cases it's quite variable, but the max latency is still not as
bad as when running with the irq chip enabled.

Anyway, the test is again not ideal, but I hope we're proving something.
That's all I can do for tonight - should be ready for more again
tomorrow night.

Regards,
Kevin.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ