[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM9d7cjnYRTwGSWhgeQAdityphZYzmm-3MeV6WdmMEE0nDw+Tg@mail.gmail.com>
Date: Sat, 23 Feb 2019 13:56:35 +0900
From: Namhyung Kim <namhyung@...nel.org>
To: Jonas Rabenstein <jonas.rabenstein@...dium.uni-erlangen.de>
Cc: linux-perf-users <linux-perf-users@...r.kernel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
David Miller <davem@...emloft.net>,
Eric Saint-Etienne <eric.saint.etienne@...cle.com>,
Ingo Molnar <mingo@...hat.com>, Jiri Olsa <jolsa@...hat.com>,
Kim Phillips <kim.phillips@....com>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
Milian Wolff <milian.wolff@...b.com>,
Peter Zijlstra <peterz@...radead.org>,
Rob Gardner <rob.gardner@...cle.com>,
Sandipan Das <sandipan@...ux.ibm.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/3] Support inline symbols in callchains
On Fri, Feb 22, 2019 at 1:07 AM Jonas Rabenstein
<jonas.rabenstein@...dium.uni-erlangen.de> wrote:
>
> Hi,
> This patchset supersedes my previous attempt to add inline symbols to
> callchain of perf-script [1] by a more generic attempt to not hook in
> the output stage but directly into the callchain generation. By a matter
> of fact this adds those inline symbols automatically to other commands
> like perf-report.
> Additionally this fixes the regression reported by Jiri Olsa [2] that
> some entries from previous outputs had been vanished and now only new
> lines are added if symbols had been found.
>
> The integration for perf-report is not completely done as there is an
> issue if the root for an hist_entry as for inlined symbols there may be
> multiple instances (for each address-range) with the same name. But in
> util/sort.c:233 only the name is compared for inlined symbols. As a
> consequence the returned hist_entry may hold a reference to another
> instance for this inlined symbol (with another address range than
> requested) we later on fail with -ERANGE in __symbol__inc_addr_samples
> (util/annotate.c:857).
> This issue does still permit perf-report to be executed without any
> problems and the inlined symbols do show up but none of the samples is
> actually accounted to them but to the original symbol :(
>
> To further provide information what this changeset is doing, here is a
> script-session to show the differences in the output:
>
> [jonas@...s]$ git reset --hard v5.0-rc7; make -C tools/perf >/dev/null; \
> HEAD is now at a3b22b9f11d9 Linux 5.0-rc7
> [jonas@...s]$ git am *.patch; make -C tools/perf >/dev/null; \
> mv tools /perf/perf perf-new
> Applying: perf map: add function to lookup inline symbols
> Applying: perf machine: use map__inlines in append_inlines
> Applying: perf machine: add inline symbols to callchains
> [jonas@...s]$ cat test.c
> static int foo(int m){int r=0; for(int i=0;i<m;++i)r+=i; return r;}
> static int bar(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i); return r;}
> static int baz(int m){int r=0; for(int i=0;i<m;++i)r+=foo(i)*bar(i); return r;}
> int main() { return baz(421); }
> [jonas@...s]$ gcc -O2 -fno-omit-frame-pointer -g -o test test.c
> [jonas@...s]$ ./perf-new record --call-graph fp ./test.c
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (19 samples) ]
> [jonas@...s]$ time ./perf-old script >old.script
> real 0m0.039s
> user 0m0.012s
> sys 0m0.022s
> [jonas@...s]$ time ./perf-new script >new.script
> real 0m0.045s
> user 0m0.010s
> sys 0m0.032s
> [jonas@...s]$ diff -u old.script new.script
> --- old.script 2019-02-21 16:30:47.997194180 +0100
> +++ new.script 2019-02-21 16:30:51.757309790 +0100
> @@ -23,66 +23,101 @@
> 7f43b1148090 _start+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
>
> test 7579 1470536.968092: 354758 cycles:uppp:
> + 7f43b1148f4b elf_get_dynamic_info+0xab (inlined)
> 7f43b1148f4b _dl_start+0xab (/lib/x86_64-linux-gnu/ld-2.27.so)
> 7f43b1148098 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-2.27.so)
> [...]
>
> test 7579 1470536.969210: 1922435 cycles:uppp:
> + 563f0df9c53f foo+0x4f (inlined)
> + 563f0df9c53f bar+0x4f (inlined)
> + 563f0df9c53f baz+0x4f (inlined)
> 563f0df9c53f main+0x4f (/home/jonas/linux/test)
> 7f43b0d77b97 __libc_start_main+0xe7 (/lib/x86_64-linux-gnu/libc-2.27.so)
> 75e258d4c544155 [unknown] ([unknown])
> [jonas@...s]$ time ./perf-new script --no-inline >new.noinline.script
> real 0m0.035s
> user 0m0.012s
> sys 0m0.020s
> [jonas@...s]$ diff -u old.script new.noinline.script
> [jonas@...s]$ ./perf-old report --stdio --quiet | sed '/^$/Q'
> 88.80% 88.80% test test [.] main
> |
> ---0x75e258d4c544155
> __libc_start_main
> main
> [jonas@...s]$ ./perf-new report --stdio --quiet | sed '/^$/Q'
> 88.80% 88.80% test test [.] main
> |
> ---0x75e258d4c544155
> __libc_start_main
> main
> baz (inlined)
> bar (inlined)
> foo (inlined)
> [jonas@...s]$ ./perf-new report --stdio --quiet --no-inline | sed '/^$/Q'
> 88.80% 88.80% test test [.] main
> |
> ---0x75e258d4c544155
> __libc_start_main
> main
>
> I am still trying to find a way that in the new output of report the
> 88.80% 'Self' are not accounted to main itself but split up for the
> inlined baz, bar and foo symbols.
>
> I'm open to any help as well as feedback,
I guess you can set al->sym to a symbol of the last callchain entry in
the iter_add_single_cumulative_entry(). For annotation, al->addr
needs to be updated from the start of the inlined subroutine if DWARF
is used, but not sure for addr2line though.
Thanks,
Namhyung
Powered by blists - more mailing lists