linux-kernel - Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 18 Mar 2009 03:24:06 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Kevin Shanahan <kmshanah@...b.org.au>
Cc:	Avi Kivity <avi@...hat.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, Mike Galbraith <efault@....de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

On Wed, Mar 18, 2009 at 11:46:26AM +1030, Kevin Shanahan wrote:
> On Wed, 2009-03-18 at 01:20 +0100, Frederic Weisbecker wrote:
> > On Tue, Mar 17, 2009 at 09:25:37AM +1030, Kevin Shanahan wrote:
> > > On Mon, 2009-03-16 at 21:07 +0100, Frederic Weisbecker wrote:
> > > > I've looked a bit at your traces.
> > > > I think it's probably too wide to find something inside.
> > > > Latest -tip is provided with a new set of events tracing, meaning
> > > > that you will be able to produce function graph traces with various
> > > > sched events included.
> > > > 
> > > > Another thing, is it possible to reproduce it with only one ping?
> > > > Or testing perioding pings and keep only one that raised a relevant
> > > > threshold of latency? I think we could do a script that can do that.
> > > > It would make the trace much clearer.
> > > 
> > > Yeah, I think that should be possible. If you can come up with such a
> > > script, that would be great.
> > 
> > Ok, I've made a small script based on yours which could do this job.
> > You will just have to set yourself a threshold of latency
> > that you consider as buggy. I don't remember the latency you observed.
> > About 5 secs right?
> > 
> > It's the "thres" variable in the script.
> > 
> > The resulting trace should be a mixup of the function graph traces
> > and scheduler events which look like this:
> > 
> >  gnome-screensav-4691  [000]  6716.774277:   4691:120:S ==> [000]     0:140:R <idle>
> >   xfce4-terminal-4723  [001]  6716.774303:   4723:120:R   + [001]  4289:120:S Xorg
> >   xfce4-terminal-4723  [001]  6716.774417:   4723:120:S ==> [001]  4289:120:R Xorg
> >             Xorg-4289  [001]  6716.774427:   4289:120:S ==> [001]     0:140:R <idle>
> > 
> > + is a wakeup and ==> is a context switch.
> > 
> > 
> > The script will loop trying some pings and will only keep the trace that matches
> > the latency threshold you defined.
> > 
> > Tell if the following script work for you.
> 
> Yes, this looks like it will work as intended.
> 
> One thing I was thinking about though - would we need to look for a
> trace that consists of a fast ping followed by a slow ping? If we only
> keep the trace data from when we experience the slow ping, the guest
> will have already "stalled" before the trace started, so the trace won't
> indicate any of the information about how we got into that state. Is
> that information going to be important, or is it enough to just see what
> it does to get out of the stalled state?


I don't know :-)
I fear the only thing we would see by looking at a fast ping trace
is the kvm going to sleep at the end. I guess the hot black box
here is likely: what happens when we try to wake up kvm and why it is
taking so long.

May be by looking at a slow ping trace, we could follow the flow once
the kvm is supposed to be awaken. At this stage, we can perhaps
follow both the scheduler and kvm activities. Hopefully after that
we can reduce more the trace, by filtering some specific areas.

It will likely end up with some ftrace_printk() (putting specific
trace messages in defined locations)...


 
> Either way, I'll try to get some results in my maintenance window
> tonight.
>
> Cheers,
> Kevin.
> 
> > You will need to pull the latest -tip tree and enable the following:
> > 
> > CONFIG_FUNCTION_TRACER=y
> > CONFIG_FUNCTION_GRAPH_TRACER=y
> > CONFIG_DYNAMIC_FTRACE=y
> > CONFIG_SCHED_TRACER=y
> > CONFIG_CONTEXT_SWITCH_TRACER=y
> > CONFIG_EVENT_TRACER=y
> > 
> > Thanks!
> > 
> > Ah and you will need python too (since bash can't do floating point
> > operation, it uses python here).
> > 
> > #!/bin/bash
> > 
> > # Switch off all CPUs except for one to simplify the trace
> > echo 0 > /sys/devices/system/cpu/cpu1/online
> > echo 0 > /sys/devices/system/cpu/cpu2/online
> > echo 0 > /sys/devices/system/cpu/cpu3/online
> > 
> > 
> > # Make sure debugfs has been mounted
> > if [ ! -d /sys/kernel/debug/tracing ]; then
> >     mount -t debugfs debugfs /sys/kernel/debug
> > fi
> > 
> > # Set up the trace parameters
> > pushd /sys/kernel/debug/tracing || exit 1
> > echo 0 > tracing_enabled
> > echo function_graph > current_tracer
> > echo funcgraph-abstime > trace_options
> > echo funcgraph-proc    > trace_options
> > 
> > # Set here the kvm IP addr
> > addr=""
> > 
> > # Set here a threshold of latency in sec
> > thres="5"
> > found="False"
> > lat=0
> > prefix=/sys/kernel/debug/tracing
> > 
> > echo 1 > $prefix/events/sched/sched_wakeup/enable
> > echo 1 > $prefix/events/sched/sched_switch/enable
> > 
> > while [ "$found" != "True" ]
> > do
> > 	# Flush the previous buffer
> > 	echo nop > $prefix/current_tracer
> > 
> > 	# Reset the function_graph tracer
> > 	echo function_graph > $prefix/current_tracer
> > 
> > 	echo 1 > $prefix/tracing_enabled
> > 	lat=$(ping -c 1 $addr | grep rtt | grep -Eo " [0-9]+.[0-9]+")
> > 	echo 0 > $prefix/tracing_enabled
> > 
> > 	found=$(python -c "print float(str($lat).strip()) > $thres")
> > 	sleep 0.01
> > done
> > 
> > echo 0 > $prefix/events/sched/sched_wakeup/enable
> > echo 0 > $prefix/events/sched/sched_switch/enable
> > 
> > 
> > echo "Found buggy latency: $lat"
> > echo "Please send the trace you will find on $prefix/trace"
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/