Message-Id: <200804231136.50639.elendil@planet.nl>
Date: Wed, 23 Apr 2008 11:36:49 +0200
From: Frans Pop <elendil@...net.nl>
To: Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: linux-kernel@...r.kernel.org, Mike Galbraith <efault@....de>,
Richard Jonsson <richie@...erworld.net>
Subject: Re: [git pull] scheduler changes for v2.6.26
(Dropping Rafael, Linus and Andrew from CC.)
On Monday 21 April 2008, you wrote:
> * Frans Pop <elendil@...net.nl> wrote:
> > > It would be nice if you could try sched-devel/latest because it has
> > > an improved ftrace "sched_switch" tracer where you can generate much
> > > longer traces of this incident. Try the new /debug/trace_entries
> > > runtime tunable.
> >
> > I'll try to get the trace and will reply on the private thread we had.
> > I may need additional instructions though.
>
> you could also reply to this thread if you dont mind, so that others can
> chime in too.
OK. I must admit I was a bit surprised that things were taken private last
time. But I'll just keep following your lead :-)
> the 700-800 msecs of delays you see are very "brutal" so there must be
> something fundamentally wrong going on here.
>
> Could you first check (under sched-devel/latest) the quality of your
> sched-clock, via running this script:
> http://people.redhat.com/mingo/cfs-scheduler/tools/watch-rq-clock.sh
My clock looks OK:
1009.609822
1012.920151
1015.125826
1009.978393
1011.318990
1010.932234
1012.080720
1009.895988
1013.510765
1009.280119
1009.758037
1013.955419
1008.747707
1014.474823
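To put a number on "looks OK", the samples above can be summarized as in
the sketch below. It assumes each value is the run-queue clock advance,
in milliseconds, over a nominal one-second interval; that is an
interpretation of the output, not something the script states here. The
helper name summarize is made up for this example.

```python
from statistics import mean, pstdev

# Samples copied verbatim from the watch-rq-clock.sh output above.
samples = [
    1009.609822, 1012.920151, 1015.125826, 1009.978393, 1011.318990,
    1010.932234, 1012.080720, 1009.895988, 1013.510765, 1009.280119,
    1009.758037, 1013.955419, 1008.747707, 1014.474823,
]


def summarize(values):
    """Return basic statistics for a list of clock samples.

    Assumption: the unit is milliseconds per nominal one-second interval.
    """
    lowest = min(values)
    highest = max(values)
    return {
        "count": len(values),
        "min": lowest,
        "max": highest,
        "spread": highest - lowest,  # largest minus smallest sample
        "mean": mean(values),
        "stdev": pstdev(values),     # population standard deviation
    }


summary = summarize(samples)
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```

For these samples the spread between the smallest and largest value is
about 6.4 (milliseconds, under the assumption above), which is small
next to delays of 700-800 msecs.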
Before I get to some facts from testing, first a possible tracer bug.
Trace settings:
# cat available_tracers
mmiotrace wakeup sched_switch none
# cat current_tracer
sched_switch
# cat iter_ctrl
print-parent nosym-offset nosym-addr noverbose noraw nohex nobin noblock
nostacktrace nosched-tree
# cat trace_entries
50005
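For anyone wanting to record the same settings alongside their own
traces, here is a minimal sketch that reads the four files listed above.
The directory holding them depends on where debugfs is mounted, so it is
passed in as an argument and no default is assumed; the function names
are made up for this example.

```python
from pathlib import Path

# File names as they appear in the listing above.
SETTING_FILES = ("available_tracers", "current_tracer", "iter_ctrl",
                 "trace_entries")


def read_tracer_settings(tracing_dir, names=SETTING_FILES):
    """Return {file name: contents}; unreadable files map to None."""
    settings = {}
    for name in names:
        try:
            text = (Path(tracing_dir) / name).read_text()
        except OSError:
            settings[name] = None
        else:
            # Collapse whitespace so a wrapped flag list stays on one line.
            settings[name] = " ".join(text.split())
    return settings


def format_settings(settings):
    """Render the settings as 'name: value' lines for pasting into a mail."""
    lines = []
    for name, value in settings.items():
        shown = value if value is not None else "<unreadable>"
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


if __name__ == "__main__":
    import sys
    # Usage: python tracer_settings.py <tracing directory>
    if len(sys.argv) == 2:
        print(format_settings(read_tracer_settings(sys.argv[1])))
```

Saving this output next to each trace makes it easier to tell later
which tracer and buffer size produced a given file.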
If I enable tracing with these settings, I get data in both trace *and*
latency_trace. Is the latter expected? From http://lkml.org/lkml/2008/2/10/43
I got the impression that latency_trace should only be used if the "wakeup"
tracer is active.
Right, finally to the traces and some background info.
I've run three 20 minute tests. For each I've counted the clearly audible
skips in music:
1) build without tracing, console with build running active: 17 skips
2) build without tracing, build console not visible: 16 skips
3) build with tracing, build console not visible: 11 skips
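As a rough way to compare the three runs, the counts above can be turned
into an average interval between skips. This is plain arithmetic on the
numbers listed, with a made-up helper name; it says nothing about how
the skips are actually distributed over the 20 minutes.

```python
TEST_MINUTES = 20  # length of each test run, from the text above

# Skip counts per run, copied from the list above.
skips = {
    "no tracing, build console active": 17,
    "no tracing, build console not visible": 16,
    "tracing, build console not visible": 11,
}


def seconds_between_skips(count, minutes=TEST_MINUTES):
    """Average number of seconds between skips over the whole run."""
    return minutes * 60 / count


for label, count in skips.items():
    interval = seconds_between_skips(count)
    print(f"{label}: {count} skips, one every {interval:.0f} s on average")
```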
With the first build processor usage varies between 60 and 75% (over two
cores); with the others it's fairly constant at 50%, so one core in full use
and the other basically idle. All three builds do reach the same point in
about the same time though, so the extra processor usage for the first
build is apparently just display update overhead.
The number of skips seems fairly constant. They don't seem to happen at
exactly the same points, but there are clearly points where they are more
likely. Most striking is that in all three cases I had a series of 4-6
skips at ~17-18 minutes into the build.
From the last run I've got three traces with 50000 entries each (about a
minute's worth per trace). Traces 1 and 2 should each have two skips and
trace 3 should have four. I've saved both the trace and latency_trace
(ltrace) files.
They are available at: http://people.debian.org/~fjp/tmp/kernel/sched/.
BTW, did either of you actually look at the traces I sent for .24? I never
got any feedback on those.
Cheers,
FJP
P.S. I've got group scheduling active in this config as my tests with .24
showed that it did not make any difference. I can rerun without it if needed.
View attachment "config-2.6.26-rc0-sched-devel.git-x86-latest.git" of type "text/plain" (59238 bytes)