Date:	Wed, 21 Jan 2009 02:21:26 +1030
From:	Kevin Shanahan <kmshanah@...b.org.au>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Avi Kivity <avi@...hat.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Mike Galbraith <efault@....de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	bugme-daemon@...zilla.kernel.org
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

On Tue, 2009-01-20 at 15:25 +0100, Ingo Molnar wrote:
> > I could run top, vmstat and cat /proc/sched_debug in a loop until the
> > problem occurs and then trim it. Something like:
> > 
> > i=0
> > while true; do
> >   i=$((i + 1))                        # snapshot counter for the label below
> >   date                                >> $FILE
> >   echo "-- top: --"                   >> $FILE
> >   top -H -c -b -n 1                   >> $FILE 2>/dev/null
> >   echo "-- vmstat: --"                >> $FILE
> >   vmstat                              >> $FILE 2>/dev/null
> >   echo "-- sched_debug #$i: --"       >> $FILE
> >   cat /proc/sched_debug               >> $FILE 2>/dev/null
> >   sleep 0.5                           # fractional sleep needs GNU sleep
> > done
> > 
> > That should take a snapshot every half second or so.
> 
> Yeah, that would be lovely. You don't even have to trim it much - just give 
> us a timestamp to look at for the delay incident. You might also want to 
> start the kvm session while the script is already running - that way we'll 
> get fresh statistics and see the whole thing.

I've uploaded the debug info here:
  http://disenchant.net/tmp/bug-12465/

Some interesting sections should be around these times:

  01:36:04 -> 01:36:27
  01:37:30 -> 01:37:42
  01:37:52 -> 01:37:56
  01:39:37 -> 01:39:40
  01:40:01 -> 01:40:14

The output from ping is there too so you can see how the delays usually
show up (i.e. in clusters). The large debug file runs from before I
launched the VMs, right through the ping test. The trimmed file just
cuts out everything before I started ping.
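
If it helps to look at just one of the windows above, something along
these lines should pull it out of either file. This is only a rough
sketch: extract_window is a made-up helper name, and it assumes each
snapshot starts with a line in the default date(1) format (six fields,
with HH:MM:SS as the fourth) and that the whole capture falls within a
single day, so the times can be compared as plain strings.

```shell
#!/bin/sh
# Hypothetical helper, not part of the uploaded files.
# Usage: extract_window START END LOGFILE   (START/END as HH:MM:SS)
extract_window() {
    awk -v start="$1" -v end="$2" '
        # Assumed snapshot header: a date(1) line with six fields whose
        # fourth field is the time of day, e.g. "Www Mmm DD HH:MM:SS ZZZ YYYY".
        NF == 6 && $4 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            # String comparison is enough for HH:MM:SS within one day.
            inside = ($4 >= start && $4 <= end)
        }
        # Print every line that belongs to a snapshot inside the window.
        inside
    ' "$3"
}
```

For example, "extract_window 01:36:04 01:36:27 LOGFILE" would print the
snapshots for the first window listed above, with LOGFILE standing in
for whichever of the two debug files you downloaded.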

Regards,
Kevin.

