linux-kernel - Re: [PATCH 0/3] perf tools DWARF libunwind: Add callchain order support

Message-ID: <564BFD06.7020901@huawei.com>
Date:	Wed, 18 Nov 2015 12:22:30 +0800
From:	"Wangnan (F)" <wangnan0@...wei.com>
To:	Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>
CC:	Jan Kratochvil <jkratoch@...hat.com>,
	lkml <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"Ingo Molnar" <mingo@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Milian Wolff <milian.wolff@...b.com>
Subject: Re: [PATCH 0/3] perf tools DWARF libunwind: Add callchain order support

Hi Jiri,


On 2015/11/17 23:05, Jiri Olsa wrote:
> hi,
> as reported by Milian, currently for DWARF unwind (both libdw
> and libunwind) we display callchain in callee order only.
>
> Adding the support to follow callchain order setup to libunwind
> DWARF unwinder, so we could get following output for report:
>
>    $ perf record --call-graph dwarf ls
>    ...
>    $ perf report --no-children --stdio
>
>      39.26%  ls       libc-2.21.so      [.] __strcoll_l
>                   |
>                   ---__strcoll_l
>                      mpsort_with_tmp
>                      mpsort_with_tmp
>                      sort_files
>                      main
>                      __libc_start_main
>                      _start
>                      0
>
>    $ perf report -g caller --no-children --stdio
>      ...
>      39.26%  ls       libc-2.21.so      [.] __strcoll_l
>                   |
>                   ---0
>                      _start
>                      __libc_start_main
>                      main
>                      sort_files
>                      mpsort_with_tmp
>                      mpsort_with_tmp
>                      __strcoll_l
>
> Tested on x86_64. The change is in generic code only,
> so it should not affect other archs. Still it would be
> nice to have some confirmation.. Wang Nan? ;-)
>
> It'd be nice to have this for libdw unwind as well,
> but it looks like it's out of reach for perf code.. Jan?
>
> Also available in:
>    git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>    perf/callchain_1


Thanks for notifying me about this. I have tested it in my environment.

It works well for me, except for a small behavior change. Please see below.

Before applying this patch set:

# perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
     96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
               |
               ---__vdso_gettimeofday
                  funcc
                  funcb
                  funca
                  main
                  __libc_start_main
                  _start

      3.38%  a.out    a.out             [.] funcc
               |
               ---funcc
                  |
                   --2.70%-- funcb
                             funca
                             main
                             __libc_start_main
                             _start

      0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
               |
               ---sched_clock
                  perf_event_nmi_handler
                  nmi_handle
      ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
     96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
               |
               ---__vdso_gettimeofday
                  funcc
                  funcb
                  funca
                  main
                  __libc_start_main
                  _start

      3.38%  a.out    a.out             [.] funcc
               |
               ---funcc
                  |
                   --2.70%-- funcb
                             funca
                             main
                             __libc_start_main
                             _start

      0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
               |
               ---return_from_execve
                  sys_execve
                  do_execveat_common.isra.27


The user-code parts of the two outputs are identical, so I confirm the bug.
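
To make the expected relationship concrete: for the same sample, the caller
view should be the callee chain in reverse. A minimal sketch in Python (not
perf code; the function name order_callchain is hypothetical, and the frames
are taken from the a.out output above):

```python
# Illustration only, not perf code.
# Frames as the unwinder produces them: innermost (callee) first.
UNWOUND = [
    "__vdso_gettimeofday",
    "funcc",
    "funcb",
    "funca",
    "main",
    "__libc_start_main",
    "_start",
]


def order_callchain(frames, order):
    """Return the frames in display order for --call-graph=callee|caller."""
    if order == "callee":
        return list(frames)            # innermost frame at the top
    if order == "caller":
        return list(reversed(frames))  # outermost frame at the top
    raise ValueError("unknown callchain order: %r" % (order,))


if __name__ == "__main__":
    for order in ("callee", "caller"):
        print(order + ":")
        for frame in order_callchain(UNWOUND, order):
            print("    " + frame)
```

With correct ordering the two listings differ; identical listings for both
orders are the symptom shown above.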

After applying this patchset:

# ./perf report --no-children --stdio --call-graph=callee
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
     96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
               |
               ---__vdso_gettimeofday
                  funcc
                  funcb
                  funca
                  main
                  __libc_start_main
                  _start

      3.38%  a.out    a.out             [.] funcc
               |
               ---funcc
                  |
                  |--2.70%-- funcb
                  |          funca
                  |          main
                  |          __libc_start_main
                  |          _start
                  |
                   --0.68%-- 0
      0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
               |
               ---sched_clock
                  perf_event_nmi_handler
      ...

And caller:

# ./perf report --no-children --stdio --call-graph=caller
# Overhead  Command  Shared Object     Symbol
# ........  .......  ................  .........................
#
     96.61%  a.out    [vdso]            [.] __vdso_gettimeofday
               |
               ---_start
                  __libc_start_main
                  main
                  funca
                  funcb
                  funcc
                  __vdso_gettimeofday

      3.38%  a.out    a.out             [.] funcc
               |
               |--2.70%-- _start
               |          __libc_start_main
               |          main
               |          funca
               |          funcb
               |          funcc
               |
                --0.68%-- 0
                          funcc

      0.02%  pref_re  [kernel.vmlinux]  [k] sched_clock
               |
               ---return_from_execve
                  sys_execve
     ...

It fixes the bug. However, do you see the extra "0.68%-- 0" in the tree?

I left a comment on patch 2/3; please have a look. I think this change
would be okay for me if we treat the old behavior as a bug (for example,
the branches did not sum to the overhead of the entry itself). However,
the original code explicitly avoids generating the '0' entry, so I think
we should make this clear.
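
The arithmetic behind that remark, as a small Python sketch (not perf code;
the helper name unaccounted is hypothetical, and the percentages are the ones
printed for funcc above):

```python
# Illustration only, not perf code.
from decimal import Decimal


def unaccounted(overhead, branches):
    """Overhead of an entry that is not covered by its printed branches."""
    total = sum((Decimal(b) for b in branches), Decimal(0))
    return Decimal(overhead) - total


# Before the patch set: funcc is 3.38% but only one 2.70% branch is printed.
before = unaccounted("3.38", ["2.70"])

# After the patch set: the extra "0.68%-- 0" branch is printed as well.
after = unaccounted("3.38", ["2.70", "0.68"])

if __name__ == "__main__":
    print("before:", before)
    print("after: ", after)
```

So the old output left 0.68% of funcc without a branch, and the new '0' entry
is exactly that remainder.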

Thank you.


> thanks,
> jirka
>
>
> Cc: Jan Kratochvil <jkratoch@...hat.com>
> ---
> Jiri Olsa (3):
>        perf tools: Move initial entry call into get_entries function
>        perf tools: Add callchain order support for libunwind DWARF unwinder
>        perf test: Add callchain order setup for DWARF unwinder test
>
>   tools/perf/tests/dwarf-unwind.c    | 22 +++++++++++++++++++---
>   tools/perf/util/unwind-libunwind.c | 60 +++++++++++++++++++++++++++++++++++++++---------------------
>   2 files changed, 58 insertions(+), 24 deletions(-)

