Message-ID: <20100131203103.GD5224@nowhere>
Date: Sun, 31 Jan 2010 21:31:05 +0100
From: Frederic Weisbecker <fweisbec@...il.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Hitoshi Mitake <mitake@....info.waseda.ac.jp>,
linux-kernel@...r.kernel.org,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Tom Zanussi <tzanussi@...il.com>,
Jens Axboe <jens.axboe@...cle.com>,
Paul Mackerras <paulus@...ba.org>,
Anton Blanchard <anton@...ba.org>,
Mike Galbraith <efault@....de>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH 00/12 v2] perf lock: New subcommand "perf lock", for
analyzing lock statistics
On Sun, Jan 31, 2010 at 09:29:53AM +0100, Ingo Molnar wrote:
>
> FYI, i've applied a file/line-less version of 'perf lock' to perf/core today.
>
> The basic workflow is the usual:
>
> perf lock record sleep 1 # or some other command
> perf lock report # or 'perf lock trace'
>
> [ I think we can do all the things that file/line can do with a less intrusive
> (and more standard) call-site-IP based approach. For now we can key off the
> names of the locks, that's coarser but also informative and allows us to
> progress.
>
> I've renamed 'perf lock prof' to 'perf lock report' - which is more in line
> with other perf tools. ]
>
> The tool clearly needs more work at the moment: i have tried perf lock on a 16
> cpus box, and it was very slow, while it didn't really record all that many
> events to justify the slowdown. A simple:
>
> perf lock record sleep 1
>
> makes the system very slow and requires a Ctrl-C to stop:
>
> # time perf lock record sleep 1
> ^C[ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 5.204 MB perf.data (~227374 samples) ]
>
> real 0m11.941s
> user 0m0.020s
> sys 0m11.661s
>
> (The kernel config i used with that is attached.)
>
> My suspicion is that the overhead of CONFIG_LOCK_STAT based tracing is way too
> high at the moment, and needs to be reduced. I have removed the '-R' option
> from perf lock record (which it got from perf sched where it makes sense but
> here it's not really needed and -R further increases overhead), but that has
> not solved the slowdown.

Hmm, -R is mandatory if you want the raw sample events; otherwise the
event is just a counter.

Maybe you mean -M? Sure, -M is going to add noticeable overhead
on 16 cores.

Anyway, I'm looking closely into improving the lock events to
reduce all this overhead. I'll create a lock_init event so
that we can gather the heavy information there (especially
the name of the lock).

Also, using TRACE_EVENT_FN lets us register a callback when
a tracepoint gets registered, so I'm going to try to synthesize
the missing lock_init() events here.
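
To make the idea concrete, here is a rough user-space model of it, not kernel code. Every name in it (struct lock, trace_register(), lookup_name(), the fixed-size buffers) is made up for illustration and is not an existing kernel or perf API. It only shows the intended shape: the lock name is recorded once in a lock_init event, the hot-path events carry just the lock address, and the registration callback replays lock_init for locks that were initialized before tracing was enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MAX_LOCKS  8
#define MAX_EVENTS 32

struct lock { int held; };

struct event {
	const char *type;        /* "lock_init" or "lock_acquire" */
	const struct lock *addr; /* identifies the lock in every event */
	const char *name;        /* only set for lock_init events */
};

static const struct lock *live_locks[MAX_LOCKS];
static const char *live_names[MAX_LOCKS];
static int nr_locks;

static struct event trace_buf[MAX_EVENTS];
static int nr_events;
static int tracing_on;

static void emit(const char *type, const struct lock *l, const char *name)
{
	if (!tracing_on)
		return;
	assert(nr_events < MAX_EVENTS);
	trace_buf[nr_events].type = type;
	trace_buf[nr_events].addr = l;
	trace_buf[nr_events].name = name;
	nr_events++;
}

/* Heavy event: carries the name, emitted once per lock. */
static void lock_init(struct lock *l, const char *name)
{
	assert(nr_locks < MAX_LOCKS);
	l->held = 0;
	live_locks[nr_locks] = l;
	live_names[nr_locks] = name;
	nr_locks++;
	emit("lock_init", l, name);
}

/* Light event: only the address, no string copy on the hot path. */
static void lock_acquire(struct lock *l)
{
	l->held = 1;
	emit("lock_acquire", l, NULL);
}

/*
 * Stand-in for the registration callback: enable tracing, then
 * synthesize lock_init events for locks that already exist.
 */
static void trace_register(void)
{
	int i;

	tracing_on = 1;
	for (i = 0; i < nr_locks; i++)
		emit("lock_init", live_locks[i], live_names[i]);
}

/* Post-processing side: resolve an address back to its name. */
static const char *lookup_name(const struct lock *l)
{
	int i;

	for (i = 0; i < nr_events; i++)
		if (trace_buf[i].addr == l &&
		    strcmp(trace_buf[i].type, "lock_init") == 0)
			return trace_buf[i].name;
	return NULL;
}
```

With this shape, a lock initialized before trace_register() is called still gets a name in the trace, because the callback replays its lock_init event. How the kernel would actually walk its existing locks is left open here.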