Message-ID: <c99925a7-6369-d4d6-5227-3b39f5aca8ed@gmail.com>
Date: Thu, 29 Jun 2017 01:27:34 +0900
From: Taeung Song <treeze.taeung@...il.com>
To: Milian Wolff <milian.wolff@...b.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-kernel@...r.kernel.org,
Adrian Hunter <adrian.hunter@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
David Ahern <dsahern@...il.com>,
Jin Yao <yao.jin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Kim Phillips <kim.phillips@....com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Wang Nan <wangnan0@...wei.com>
Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the
new source code TUI view
On 06/28/2017 06:53 PM, Milian Wolff wrote:
> On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
>> Hi,
>>
>> The --source-only option and the new source code TUI view can show
>> the result of performance analysis against the full source code of each
>> symbol (function). (Namhyung Kim suggested this idea to me, and it was also
>> requested by others some time ago.)
>>
>> If someone wants to see the cause, he/she will need to dig into the asm.
>> But before that, looking at the source level can give a hint or clue
>> for the problem.
>>
>> For example, if the target symbol is 'hex2u64' in util/util.c,
>> the output looks like this:
>>
>> $ perf annotate --source-only --stdio -s hex2u64
>>  Percent |      Source code of util.c for cycles:ppp (42 samples)
>> -----------------------------------------------------------------
>>     0.00 :  354   * While we find nice hex chars, build a long_val.
>>     0.00 :  355   * Return number of chars processed.
>>     0.00 :  356   */
>>     0.00 :  357  int hex2u64(const char *ptr, u64 *long_val)
>>     2.38 :  358  {
>>     2.38 :  359          const char *p = ptr;
>>     0.00 :  360          *long_val = 0;
>>     0.00 :  361
>>    30.95 :  362          while (*p) {
>>    23.81 :  363                  const int hex_val = hex(*p);
>>     0.00 :  364
>>    14.29 :  365                  if (hex_val < 0)
>>     0.00 :  366                          break;
>>     0.00 :  367
>>    26.19 :  368                  *long_val = (*long_val << 4) | hex_val;
>>     0.00 :  369                  p++;
>>     0.00 :  370          }
>>     0.00 :  371
>>     0.00 :  372          return p - ptr;
>>     0.00 :  373  }
>>
>> I added many perf developers to Cc: because I would like to hear your
>> opinions about this new feature, if you don't mind.
>>
>> If you give some feedback, I'd appreciate it! :)
>
> Thanks Taeung,
>
> I requested this feature some time ago and it's really cool to see someone
> step up and implement it - much appreciated!
Thank you so much, Milian!! :)
>
> I just tested it out on my pet-example that leverages C++ instead of C:
>
> ~~~~~
> #include <complex>
> #include <cmath>
> #include <random>
> #include <iostream>
>
> using namespace std;
>
> int main()
> {
>     uniform_real_distribution<double> uniform(-1E5, 1E5);
>     default_random_engine engine;
>     double s = 0;
>     for (int i = 0; i < 10000000; ++i) {
>         s += norm(complex<double>(uniform(engine), uniform(engine)));
>     }
>     cout << s << '\n';
>     return 0;
> }
> ~~~~~
>
> Compile it with:
>
> g++ -O2 -g -std=c++11 test.cpp -o test
>
> Then record it with perf:
>
> perf record --call-graph dwarf ./test
>
> Then analyse it with `perf report`. You'll see one entry for main with
> something like:
>
> + 100.00% 39.69% cpp-inlining cpp-inlining [.] main
>
> Select it and annotate it, then switch to your new source-only view:
>
> main  test.cpp
>        │    30
>        │    31  using namespace std;
>        │    32
>        │    33  int main()
>        │+   34  {
>        │    35      uniform_real_distribution<double> uniform(-1E5, 1E5);
>        │    36      default_random_engine engine;
>        │+   37      double s = 0;
>        │+   38      for (int i = 0; i < 10000000; ++i) {
>   4.88 │+   39          s += norm(complex<double>(uniform(engine), uniform(engine)));
>        │    40      }
>        │    41      cout << s << '\n';
>        │    42      return 0;
>        │+   43  }
>
> Note 1: The line numbers are off because my file has a file header on top.
> Ignore that.
>
> Note 2: There is no column header shown, so it's unclear what the first
> column represents.
>
> Note 3: perf report showed 39.69% self cost in main and 100.00% inclusive,
> but annotate shows 4.88. What is that number?
>
> What this shows is that it's extremely important to visualize inclusive cost
> _and_ self cost in this view. Additionally, we need to account for inlining.
> Right now, I suspect we only see the self cost that is directly within main.
> For C++ this is usually very misleading, and basically makes the annotate view
> useless for application-level profiling. If a second column were added with
> the inclusive cost, along with the ability to drill down, then I could
> easily see myself using this view.
>
> I would appreciate it if you could take this into account.
>
> Thanks a lot
>
>
Sure, I got it.
I'll investigate this weird case and recheck this patchset based on your
comments, and then I'll reply again. :)
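
To make sure we mean the same thing by self cost and inclusive cost per source line, here is a minimal sketch of the distinction. It is not code from perf or from this patchset: the `Frame`, `Sample`, `LineCost` and `annotate_lines` names are hypothetical, and it assumes each sample already carries a resolved call chain (leaf first) in which inlined callees appear as ordinary frames.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// One frame of a sampled call chain: a function name and a source line.
// (Hypothetical type for illustration, not a perf data structure.)
struct Frame {
    std::string function;
    int line;
};

// A sample is a call chain ordered leaf-first: index 0 is where the CPU was.
using Sample = std::vector<Frame>;

// Per-line cost, as a percentage of all samples.
struct LineCost {
    double self_percent = 0.0;       // samples whose leaf frame is on this line
    double inclusive_percent = 0.0;  // samples with this line anywhere in the chain
};

// Aggregate self and inclusive cost for every source line of `function`.
std::map<int, LineCost> annotate_lines(const std::vector<Sample>& samples,
                                       const std::string& function)
{
    std::map<int, LineCost> costs;
    if (samples.empty())
        return costs;

    const double weight = 100.0 / static_cast<double>(samples.size());
    for (const Sample& sample : samples) {
        if (sample.empty())
            continue;

        // Self cost: only the leaf frame counts.
        if (sample.front().function == function)
            costs[sample.front().line].self_percent += weight;

        // Inclusive cost: any frame counts, but each line at most once
        // per sample so recursion does not inflate the number.
        std::set<int> seen;
        for (const Frame& frame : sample) {
            if (frame.function == function && seen.insert(frame.line).second)
                costs[frame.line].inclusive_percent += weight;
        }
    }
    return costs;
}
```

With a model like this, a line such as the `s += norm(...)` call in your example would get a small self cost but a large inclusive cost, because most samples land in the callees rather than in main itself.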
Thanks,
Taeung