Message-ID: <564BFD06.7020901@huawei.com>
Date: Wed, 18 Nov 2015 12:22:30 +0800
From: "Wangnan (F)" <wangnan0@...wei.com>
To: Jiri Olsa <jolsa@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>
CC: Jan Kratochvil <jkratoch@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
David Ahern <dsahern@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
"Ingo Molnar" <mingo@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Milian Wolff <milian.wolff@...b.com>
Subject: Re: [PATCH 0/3] perf tools DWARF libunwind: Add callchain order support
Hi Jiri,
On 2015/11/17 23:05, Jiri Olsa wrote:
> hi,
> as reported by Milian, currently for DWARF unwind (both libdw
> and libunwind) we display callchain in callee order only.
>
> Adding the support to follow callchain order setup to libunwind
> DWARF unwinder, so we could get following output for report:
>
> $ perf record --call-graph dwarf ls
> ...
> $ perf report --no-children --stdio
>
> 39.26% ls libc-2.21.so [.] __strcoll_l
> |
> ---__strcoll_l
> mpsort_with_tmp
> mpsort_with_tmp
> sort_files
> main
> __libc_start_main
> _start
> 0
>
> $ perf report -g caller --no-children --stdio
> ...
> 39.26% ls libc-2.21.so [.] __strcoll_l
> |
> ---0
> _start
> __libc_start_main
> main
> sort_files
> mpsort_with_tmp
> mpsort_with_tmp
> __strcoll_l
>
> Tested on x86_64. The change is in generic code only,
> so it should not affect other archs. Still it would be
> nice to have some confirmation.. Wang Nan? ;-)
>
> It'd be nice to have this for libdw unwind as well,
> but it looks like it's out of reach for perf code.. Jan?
>
> Also available in:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/callchain_1
Thanks for letting me know about this. I have tested it in my environment.
It works well for me except for a small behavior change. Please see below.
Before applying this patch set:
# perf report --no-children --stdio --call-graph=callee
# Overhead Command Shared Object Symbol
# ........ ....... ................ .........................
#
96.61% a.out [vdso] [.] __vdso_gettimeofday
|
---__vdso_gettimeofday
funcc
funcb
funca
main
__libc_start_main
_start
3.38% a.out a.out [.] funcc
|
---funcc
|
--2.70%-- funcb
funca
main
__libc_start_main
_start
0.02% pref_re [kernel.vmlinux] [k] sched_clock
|
---sched_clock
perf_event_nmi_handler
nmi_handle
...
And caller:
# ./perf report --no-children --stdio --call-graph=caller
# Overhead Command Shared Object Symbol
# ........ ....... ................ .........................
#
96.61% a.out [vdso] [.] __vdso_gettimeofday
|
---__vdso_gettimeofday
funcc
funcb
funca
main
__libc_start_main
_start
3.38% a.out a.out [.] funcc
|
---funcc
|
--2.70%-- funcb
funca
main
__libc_start_main
_start
0.02% pref_re [kernel.vmlinux] [k] sched_clock
|
---return_from_execve
sys_execve
do_execveat_common.isra.27
The user-space parts of the two outputs are identical, so I can confirm the bug.
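To illustrate what I expected from --call-graph=caller: for a single chain,
caller order is just the callee-ordered frames reversed. A minimal sketch,
not perf code; reverse_frames() is a hypothetical helper introduced only for
illustration:

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of perf: reverse an array of frame
 * names in place, turning a callee-ordered chain into a caller-ordered
 * one (or the other way round).
 */
void reverse_frames(const char **frames, size_t n)
{
	size_t i;

	for (i = 0; i < n / 2; i++) {
		const char *tmp = frames[i];

		frames[i] = frames[n - 1 - i];
		frames[n - 1 - i] = tmp;
	}
}
```

Applied to the __vdso_gettimeofday chain above, this would turn
"__vdso_gettimeofday ... _start" into "_start ... __vdso_gettimeofday".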
After applying this patchset:
# ./perf report --no-children --stdio --call-graph=callee
# Overhead Command Shared Object Symbol
# ........ ....... ................ .........................
#
96.61% a.out [vdso] [.] __vdso_gettimeofday
|
---__vdso_gettimeofday
funcc
funcb
funca
main
__libc_start_main
_start
3.38% a.out a.out [.] funcc
|
---funcc
|
|--2.70%-- funcb
| funca
| main
| __libc_start_main
| _start
|
--0.68%-- 0
0.02% pref_re [kernel.vmlinux] [k] sched_clock
|
---sched_clock
perf_event_nmi_handler
...
And caller:
# ./perf report --no-children --stdio --call-graph=caller
# Overhead Command Shared Object Symbol
# ........ ....... ................ .........................
#
96.61% a.out [vdso] [.] __vdso_gettimeofday
|
---_start
__libc_start_main
main
funca
funcb
funcc
__vdso_gettimeofday
3.38% a.out a.out [.] funcc
|
|--2.70%-- _start
| __libc_start_main
| main
| funca
| funcb
| funcc
|
--0.68%-- 0
funcc
0.02% pref_re [kernel.vmlinux] [k] sched_clock
|
---return_from_execve
sys_execve
...
This fixes the bug. However, do you see the extra "0.68%-- 0" in the tree?
I have commented on patch 2/3; please have a look. I think this change
is okay for me if we treat the old behavior as a bug (for example, the
sum of all branches did not equal the overhead of the entry itself).
However, the original code explicitly avoids generating the '0' entry, so
I think we should make this clear.
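The "sum of branches" point can be checked with simple arithmetic. A minimal
sketch, assuming percentages are stored as hundredths of a percent (3.38%
becomes 338) so the comparison is exact; branches_match_overhead() is a
hypothetical helper, not part of perf:

```c
#include <stddef.h>

/*
 * Hypothetical check, not part of perf. All values are in hundredths
 * of a percent. Returns 1 if the branch percentages add up to the
 * overhead of the entry they belong to, 0 otherwise.
 */
int branches_match_overhead(unsigned int overhead,
			    const unsigned int *branches, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += branches[i];

	return sum == overhead;
}
```

For the funcc entry, the old output lists only the 2.70% branch under a
3.38% entry, so the check fails; with the extra 0.68% branch it holds
(270 + 68 = 338).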
Thank you.
> thanks,
> jirka
>
>
> Cc: Jan Kratochvil <jkratoch@...hat.com>
> ---
> Jiri Olsa (3):
> perf tools: Move initial entry call into get_entries function
> perf tools: Add callchain order support for libunwind DWARF unwinder
> perf test: Add callchain order setup for DWARF unwinder test
>
> tools/perf/tests/dwarf-unwind.c | 22 +++++++++++++++++++---
> tools/perf/util/unwind-libunwind.c | 60 +++++++++++++++++++++++++++++++++++++++---------------------
> 2 files changed, 58 insertions(+), 24 deletions(-)