Message-ID: <20131002101826.GC7941@localhost.localdomain>
Date: Wed, 2 Oct 2013 12:18:28 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>,
Ingo Molnar <mingo@...nel.org>,
Namhyung Kim <namhyung.kim@....com>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jiri Olsa <jolsa@...hat.com>
Subject: Re: [PATCH 1/8] perf callchain: Convert children list to rbtree
On Thu, Sep 26, 2013 at 05:58:03PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@....com>
>
> The current collapse stage has a scalability problem which can be
> reproduced easily with a parallel kernel build. This is because it
> needs to traverse every child of a callchain linearly during the
> collapse/merge stage. Converting it to an rbtree reduced the overhead
> significantly.
>
> On my 400MB perf.data file, which was recorded with a make -j32 kernel build:
>
> $ time perf --no-pager report --stdio > /dev/null
>
> before:
> real 6m22.073s
> user 6m18.683s
> sys 0m0.706s
>
> after:
> real 0m20.780s
> user 0m19.962s
> sys 0m0.689s
>
> During the perf report the overhead on append_chain_children went down
> from 96.69% to 18.16%:
>
> -  18.16%  perf  perf           [.] append_chain_children
>    - append_chain_children
>       - 77.48% append_chain_children
>          + 69.79% merge_chain_branch
>          - 22.96% append_chain_children
>             + 67.44% merge_chain_branch
>             + 30.15% append_chain_children
>             + 2.41% callchain_append
>          + 7.25% callchain_append
>       + 12.26% callchain_append
>       + 10.22% merge_chain_branch
> +  11.58%  perf  perf           [.] dso__find_symbol
> +   8.02%  perf  perf           [.] sort__comm_cmp
> +   5.48%  perf  libc-2.17.so   [.] malloc_consolidate
>
> Reported-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Jiri Olsa <jolsa@...hat.com>
> Cc: Frederic Weisbecker <fweisbec@...il.com>
> Link: http://lkml.kernel.org/n/tip-d9tcfow6stbrp4btvgs51y67@git.kernel.org
> Signed-off-by: Namhyung Kim <namhyung@...nel.org>
Have you tested this patchset when collapsing is not used?
There is a fair chance that this patchset improves not only collapsing
but also callchain insertion in general, so it's probably a win in any case.
Still, it would be nice to confirm that, because we are getting
rid of collapsing anyway.
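
To illustrate why insertion in general might benefit, here is a minimal sketch, not the actual perf code. It assumes children are keyed by a single instruction pointer; struct child, list_append() and tree_append() are hypothetical names, and a plain unbalanced binary search tree stands in for the rbtree. Each function returns the number of nodes it had to compare against.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one callchain child, keyed by its ip. */
struct child {
	uint64_t ip;
	unsigned long hits;
	struct child *next;		/* linkage for the list variant */
	struct child *left, *right;	/* linkage for the tree variant */
};

struct child *child_new(uint64_t ip)
{
	struct child *c = calloc(1, sizeof(*c));

	assert(c != NULL);
	c->ip = ip;
	c->hits = 1;
	return c;
}

/* List variant: a lookup or insertion may walk every sibling. */
unsigned long list_append(struct child **head, uint64_t ip)
{
	unsigned long steps = 0;
	struct child **p = head;

	while (*p) {
		steps++;
		if ((*p)->ip == ip) {
			(*p)->hits++;
			return steps;
		}
		p = &(*p)->next;
	}
	*p = child_new(ip);
	return steps;
}

/*
 * Tree variant: each comparison discards one subtree. This is an
 * unbalanced BST for brevity; an rbtree additionally rebalances so
 * that the depth stays logarithmic for any insertion order.
 */
unsigned long tree_append(struct child **root, uint64_t ip)
{
	unsigned long steps = 0;
	struct child **p = root;

	while (*p) {
		steps++;
		if ((*p)->ip == ip) {
			(*p)->hits++;
			return steps;
		}
		p = ip < (*p)->ip ? &(*p)->left : &(*p)->right;
	}
	*p = child_new(ip);
	return steps;
}
```

With seven children inserted in the order 4, 2, 6, 1, 3, 5, 7, finding ip 7 again takes seven comparisons in the list and three in the tree. The gap widens with the number of siblings, and it applies to any insertion, not only to the collapse/merge path.
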
The test that could tell us is to run "perf report -s sym" and compare the
time it takes to complete before and after this patch, because "-s sym" shouldn't
involve collapses.

In fact, sorting by anything that is not comm should do the trick.

Thanks.