Message-ID: <20100131210707.GF5224@nowhere>
Date: Sun, 31 Jan 2010 22:07:09 +0100
From: Frederic Weisbecker <fweisbec@...il.com>
To: Jens Axboe <jens.axboe@...cle.com>
Cc: Hitoshi Mitake <mitake@....info.waseda.ac.jp>,
Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>,
Tom Zanussi <tzanussi@...il.com>,
Steven Rostedt <srostedt@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Greg Kroah-Hartman <gregkh@...e.de>
Subject: Re: [PATCH 00/12] perf lock: New subcommand "perf lock", for
analyzing lock statistics
On Sun, Jan 31, 2010 at 09:44:08PM +0100, Jens Axboe wrote:
> On Sat, Jan 30 2010, Hitoshi Mitake wrote:
> > (2010/01/29 23:34), Jens Axboe wrote:
> >> On Fri, Jan 22 2010, Hitoshi Mitake wrote:
> >>
> >>> Adding new subcommand "perf lock" to perf.
> >>>
> >>> I made this patch series on
> >>> latest perf/core of tip (ef12a141306c90336a3a10d40213ecd98624d274),
> >>> so please apply this series to perf/core.
> >>>
> >> [snip]
> >>
> >> I wanted to give this a go today, since I think it's pretty nifty and a
> >> lot better than using /proc/lock_stat. But it basically spirals the
> >> system into death [1]. How big a system did you test this on?
> >>
> >> [1] Got this: [ 117.097918] hrtimer: interrupt took 35093901 ns
> >>
> >>
> >
> > I tested this on Core i7 965 + 3GB DRAM machine.
> > Test program is mainly "perf bench sched messaging".
> >
> > Could you tell me the detail of your test situation?
>
> I tried to run it on a 64 thread box, on a fio job that was driving 80
> disks. It was just a quick test, but after ~20 seconds it had not even
> gotten started yet, it was still stuck in setting up the jobs and
> traversing sysfs for finding disk stats, etc. I can try something
> lighter to see if it's the cpu count or the tough job that was making it
> spiral into (near) death.
So you're experiencing a severe slowdown but it's not a lockup, right?
I think there are various reasons for that.
- The first is that lock events are high-frequency events, probably the
highest-frequency events we have, and their volume of course scales
with the number of cpus.
- They are suboptimally implemented. We copy various strings in every
lock event (acquire(d), release, contended) that could instead be sent
once in a lock_init event. Among these strings are the name of
the lock and the __FILE__ __LINE__ of its declaration.
Ingo applied Hitoshi's patches to tip:/perf/core a few hours ago
without the __FILE__ __LINE__ info. We are thinking about other
ways to get that information.
The name is going to move into a new lock_init event; I'm currently
working on it.
- We are using the -M option of the perf tools, which multiplexes the
event buffers: every cpu commits its lock events into the same
buffer. Coupled with the two reasons above, this scales about as
badly as possible.
Given that, and the fact that you have a 64 thread box, I'm not very
surprised by such a global slowdown.
But we are working on this and will try to make it better in this
regard.
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/