Date:	Sat, 21 Mar 2009 15:30:39 +1030
From:	Kevin Shanahan <kmshanah@...b.org.au>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Avi Kivity <avi@...hat.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, Mike Galbraith <efault@....de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

On Thu, 2009-03-19 at 07:54 +1030, Kevin Shanahan wrote:
> On Wed, 2009-03-18 at 11:46 +1030, Kevin Shanahan wrote:
> > On Wed, 2009-03-18 at 01:20 +0100, Frederic Weisbecker wrote:
> > > Ok, I've made a small script based on yours which could do this job.
> > > You will just have to set yourself a threshold of latency
> > > that you consider as buggy. I don't remember the latency you observed.
> > > About 5 secs right?
> > > 
> > > It's the "thres" variable in the script.
> > > 
> > > The resulting trace should be a mixup of the function graph traces
> > > and scheduler events which look like this:
> > > 
> > >  gnome-screensav-4691  [000]  6716.774277:   4691:120:S ==> [000]     0:140:R <idle>
> > >   xfce4-terminal-4723  [001]  6716.774303:   4723:120:R   + [001]  4289:120:S Xorg
> > >   xfce4-terminal-4723  [001]  6716.774417:   4723:120:S ==> [001]  4289:120:R Xorg
> > >             Xorg-4289  [001]  6716.774427:   4289:120:S ==> [001]     0:140:R <idle>
> > > 
> > > + is a wakeup and ==> is a context switch.
> > > 
> > > The script will loop trying some pings and will only keep the trace that matches
> > > the latency threshold you defined.
> > > 
> > > Tell if the following script work for you.
> 
> ...
> 
> > Either way, I'll try to get some results in my maintenance window
> > tonight.
> 
> Testing did not go so well. I compiled and booted
> 2.6.29-rc8-tip-02630-g93c4989, but had some problems with the system
> load when I tried to start tracing - it shot up to around 16-20 or so. I
> started shutting down VMs to try and get it under control, but before I
> got back to tracing again the machine disappeared off the network -
> unresponsive to ping.
> 
> When I got in this morning, there was nothing on the console, nothing in
> the logs to show what went wrong. I will try again, but my next chance
> will probably be Saturday. Stay tuned.

Okay, a new set of traces has been uploaded to:

  http://disenchant.net/tmp/bug-12465/trace-3/

These were done on the latest tip, which I pulled down this morning:
2.6.29-rc8-tip-02744-gd9937cb.

The system load was very high again when I first tried to trace with
several guests running, so I ended up having only the one guest running
and thankfully the bug was still reproducible that way.

Fingers crossed this set of traces is able to tell us something.
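
For anyone picking through the traces, the scheduler event lines quoted
above ("+" for a wakeup, "==>" for a context switch) could be split into
fields with something like the following. This is only a rough sketch in
Python, assuming every event line has the same layout as the four sample
lines Frederic quoted; the regex and the names SCHED_RE and
parse_sched_line are made up for illustration and are not part of his
script. Lines that do not match (e.g. function graph output) are skipped.

```python
import re

# Hypothetical pattern, derived only from the four sample lines quoted
# above: "task-pid [cpu] timestamp: pid:prio:state OP [cpu] pid:prio:state comm"
SCHED_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<ts>\d+\.\d+):\s+"
    r"\d+:\d+:\S+\s+"                       # pid:prio:state of the current task
    r"(?P<op>==>|\+)\s+"                    # "==>" context switch, "+" wakeup
    r"\[(?P<target_cpu>\d+)\]\s+"
    r"(?P<target_pid>\d+):\d+:\S+\s+(?P<target_comm>.+?)\s*$"
)


def parse_sched_line(line):
    """Return a dict describing one scheduler event, or None if the
    line is not a scheduler event in the assumed format."""
    m = SCHED_RE.match(line)
    if m is None:
        return None
    return {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "timestamp": float(m.group("ts")),
        "kind": "wakeup" if m.group("op") == "+" else "switch",
        "target_cpu": int(m.group("target_cpu")),
        "target_pid": int(m.group("target_pid")),
        "target_comm": m.group("target_comm"),
    }


if __name__ == "__main__":
    sample = ("  xfce4-terminal-4723  [001]  6716.774303:"
              "   4723:120:R   + [001]  4289:120:S Xorg")
    print(parse_sched_line(sample))
```

Filtering on "kind" and "target_comm" should make it easier to see when
a given task was woken and when it actually got switched in.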

Regards,
Kevin.

