Date:   Fri, 24 Mar 2017 22:00:30 +0900
From:   Masami Hiramatsu <mhiramat@...nel.org>
To:     Kim Phillips <kim.phillips@....com>
Cc:     Will Deacon <will.deacon@....com>, He Kuang <hekuang@...wei.com>,
        Wang Nan <wangnan0@...wei.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Mark Rutland <mark.rutland@....com>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [BUG?] perf: dwarf unwind doesn't work correctly on aarch64

On Thu, 23 Mar 2017 22:24:01 -0500
Kim Phillips <kim.phillips@....com> wrote:

> On Thu, 23 Feb 2017 16:50:18 +0900
> Masami Hiramatsu <mhiramat@...nel.org> wrote:
> 
> [sorry for the delay, I just saw this]
> 
> > perf record -g dwarf (and perf report) doesn't show correct callchain
> > on aarch64. Here is how to reproduce it.
> ...
> > # Samples: 6K of event 'cpu-clock:u'
> > # Event count (approx.): 1623750000
> > #
> > # Children      Self  Command  Shared Object  Symbol                    
> > # ........  ........  .......  .............  ..........................
> > #
> >     17.21%    17.21%  main     main           [.] func2
> >             |
> >             ---func2
> > 
> >     17.09%    17.09%  main     main           [.] func1
> >             |
> >             ---func1
> > 
> >     16.67%    16.67%  main     main           [.] main
> >             |
> >             ---main
> > .....
> > 
> > So, as you can see, the call graph reports each function as having
> > been called from itself. If I record with fp as below, perf reports
> > the correct call graph.
> ...
> > I guess there is a bug in libunwind on aarch64, or we failed to pass
> > the stack data to libunwind. (BTW, it works correctly on arm32.)
> 
> Trying to replicate this on a debian 9 ("stretch") arm64 box:

I'm using Debian 8 ("jessie"), but I can try Debian 9 too.

> Building acme's 'perf/urgent' branch (currently with the tag
> perf-urgent-for-mingo-4.11-20170317), natively (cd tools; make clean;
> make DEBUG=5 -C perf) shows this system has unwind support:
> 
> Auto-detecting system features:
> ...                         dwarf: [ on  ]
> ...            dwarf_getlocations: [ on  ]
> ...                         glibc: [ on  ]
> ...                          gtk2: [ on  ]
> ...                      libaudit: [ on  ]
> ...                        libbfd: [ on  ]
> ...                        libelf: [ on  ]
> ...                       libnuma: [ on  ]
> ...        numa_num_possible_cpus: [ on  ]
> ...                       libperl: [ OFF ]
> ...                     libpython: [ on  ]
> ...                      libslang: [ on  ]
> ...                     libcrypto: [ on  ]
> ...                     libunwind: [ on  ]
> ...            libdw-dwarf-unwind: [ on  ]
> ...                          zlib: [ on  ]
> ...                          lzma: [ on  ]
> ...                     get_cpuid: [ OFF ]
> ...                           bpf: [ on  ]
> 
> for which an apt search unwind returns the version:
> 
> libunwind-dev/testing,now 1.1-4.1 arm64 [installed]
>   library to determine the call-chain of a program - development
> libunwind8/testing,now 1.1-4.1 arm64 [installed,automatic]
>   library to determine the call-chain of a program - runtime

I've tried the same version and also 1.2, and neither works.

> 
> continuing, and ignoring the no debug_frame support perf configure
> mentions:
> 
> Makefile.config:421: No debug_frame support found in libunwind-aarch64
> Makefile.config:480: No debug_frame support found in libunwind

Hmm, does this mean --call-graph dwarf may not be using debuginfo?

> $ ./perf --version
> perf version 4.10.rc4.ge7ede72
> $ gcc --version
> gcc (Debian 6.3.0-6) 6.3.0 20170205
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> 
> $ gcc -O0 -ggdb3 -funwind-tables -o main main.c
> $ ./perf record -g --call-graph dwarf,1024 -e cpu-clock:u -o /tmp/perf.data -- ./main
> ^C[ perf record: Woken up 121 times to write data ]
> [ perf record: Captured and wrote 30.154 MB /tmp/perf.data (22975 samples) ]
> 
> $ ./perf --no-pager report -i /tmp/perf.data --stdio
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 22K of event 'cpu-clock:u'
> # Event count (approx.): 5743750000
> #
> # Children      Self  Command  Shared Object  Symbol               
> # ........  ........  .......  .............  .....................
> #
>    100.00%     8.14%  main     main           [.] main
>             |          
>             |--91.86%--main
>             |          func0
>             |          |          
>             |           --76.41%--func1
>             |                     |          
>             |                      --60.82%--func2
>             |                                |          
>             |                                 --45.31%--func3
>             |                                           |          
>             |                                            --30.17%--func4
>             |                                                      |          
>             |                                                       --15.04%--func
>             |          
>              --8.14%--__libc_start_main
>                        main
> ...
> 
> which looks like it should, i.e., I can't reproduce.

That sounds like good news! I'll test again on Debian 9.

> 
> You mentioned you're using the 'latest' sources for libunwind, etc.,
> but can you provide more exact details like commit IDs, and what, if
> anything, is being cross-built vs. native?

I'm using qemu-user-static to install the rootfs (via debootstrap) and perf.
To run the test code and perf, I'm currently using qemu-system-aarch64.
So it's a kind of native build.

Thank you!

-- 
Masami Hiramatsu <mhiramat@...nel.org>
