Message-ID: <f18c3b1e-ec2d-cfbc-6677-904d3f28f89c@linux.intel.com>
Date: Tue, 28 Apr 2020 16:29:28 +0800
From: "Jin, Yao" <yao.jin@...ux.intel.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
mingo@...hat.com, alexander.shishkin@...ux.intel.com,
Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v3 0/7] perf: Stream comparison
Hi Jiri,
On 4/27/2020 6:29 PM, Jiri Olsa wrote:
> On Mon, Apr 20, 2020 at 09:04:44AM +0800, Jin Yao wrote:
>> Sometimes a small change in a hot function reduces the cycles of
>> that function, but the overall workload doesn't get faster. It is
>> interesting to see where the cycles have moved to.
>>
>> What we would like to do is diff the before/after streams. We define
>> a stream as a callchain aggregated from the branch records in perf
>> samples.
>
> I wonder if we could use this on intel_pt traces, e.g. to compare
> streams for a given function call. Not sure that would be feasible, but
> it might be a good idea to write this in a generic way rather than a
> callchain-specific one.
>
> jirka
>
Yes, that's a good idea. We should try to write the code in a generic way.
Thanks
Jin Yao
>>
>> By browsing the hot streams, we can understand the hot code paths.
>> By comparing the cycle counts of the same streams between the old perf
>> data and the new perf data, we can tell whether the cycles have moved
>> to other code.
>>
>> The before stream is the stream in perf.data.old. The after stream
>> is the stream in perf.data.
>>
>> Diffing before/after streams compares the top N hottest streams
>> between the two perf data files.
>>
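As a rough illustration of picking the top N hottest streams (this is not the code from the series): the sketch below assumes each sample has already been reduced to a callchain of "symbol srcline" strings, and that a stream's hit percentage is its share of all samples. The name top_streams and the input layout are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Stream = Tuple[str, ...]


def top_streams(callchains: Iterable[Sequence[str]], n: int) -> List[Tuple[Stream, float]]:
    """Return the n most frequent callchains with their hit percentage.

    Assumption: "hits" is the share of all samples that carry this
    exact callchain. The cover letter does not spell out the formula.
    """
    counts = Counter(tuple(chain) for chain in callchains)
    total = sum(counts.values())
    # most_common() yields nothing for empty input, so no division by zero.
    return [(chain, 100.0 * hits / total) for chain, hits in counts.most_common(n)]


# Hypothetical samples: three hits on one chain, one hit on another.
samples = [
    ["main div.c:39", "main div.c:44"],
    ["main div.c:39", "main div.c:44"],
    ["main div.c:39", "main div.c:44"],
    ["main div.c:42", "compute_flag div.c:28"],
]

for chain, pct in top_streams(samples, 1):
    print(f"hits: {pct:.2f}%", " <- ".join(chain))
```

Running the same function on the samples from perf.data.old and from perf.data would give the two lists of streams that are then compared.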
>> If every entry of a stream in perf.data.old matches the corresponding
>> entry of a stream in perf.data, we consider the two streams matched;
>> otherwise they are not matched.
>>
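The matching rule can be sketched as an entry-by-entry comparison. This is only an illustration under the assumption that an entry is identified by its symbol plus source line. Entry and streams_match are hypothetical names, not identifiers from the patches.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    symbol: str
    srcline: Optional[str]  # e.g. "div.c:39"; None when unknown (rand@plt)


def streams_match(old: Sequence[Entry], new: Sequence[Entry]) -> bool:
    """Two streams match only if they have the same length and every
    entry in one equals the entry at the same position in the other."""
    if len(old) != len(new):
        return False
    return all(a == b for a, b in zip(old, new))


old = [Entry("main", "div.c:39"), Entry("main", "div.c:44")]
new = [Entry("main", "div.c:39"), Entry("main", "div.c:44")]

print(streams_match(old, new))      # True
print(streams_match(old, new[:1]))  # False: different number of entries
```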
>> For example,
>>
>> cycles: 1, hits: 26.80%         cycles: 1, hits: 27.30%
>> --------------------------      --------------------------
>> main div.c:39                   main div.c:39
>> main div.c:44                   main div.c:44
>>
>> The above streams are matched, and we can see that for the same streams
>> the cycles (1) are equal and the callchain hit percentages differ
>> slightly (26.80% vs. 27.30%). That's expected.
>>
>> But that doesn't always hold if the source code changed between
>> perf.data.old and perf.data (e.g. div.c:39 was changed). If div.c:39
>> was changed, the streams are different and we can't compare them. We
>> treat the stream in perf.data as a new stream.
>>
>> The challenge is how to identify the changed source lines. The basic
>> idea is to use the Linux "diff" command to compare source file A with
>> source file A* line by line (assuming file A is used in perf.data.old
>> and file A* is used in perf.data). From the "diff" output, we can
>> generate a source line mapping table.
>>
>> For example,
>>
>> Execute 'diff ./before/div.c ./after/div.c'
>>
>> 25c25
>> < i = rand() % 2;
>> ---
>> > i = rand() % 4;
>> 39c39
>> < for (i = 0; i < 2000000000; i++) {
>> ---
>> > for (i = 0; i < 20000000001; i++) {
>>
>> div.c (after -> before) lines mapping:
>> 0 -> 0
>> 1 -> 1
>> 2 -> 2
>> 3 -> 3
>> 4 -> 4
>> 5 -> 5
>> 6 -> 6
>> 7 -> 7
>> 8 -> 8
>> 9 -> 9
>> ...
>> 24 -> 24
>> 25 -> -1
>> 26 -> 26
>> 27 -> 27
>> 28 -> 28
>> 29 -> 29
>> 30 -> 30
>> 31 -> 31
>> 32 -> 32
>> 33 -> 33
>> 34 -> 34
>> 35 -> 35
>> 36 -> 36
>> 37 -> 37
>> 38 -> 38
>> 39 -> -1
>> 40 -> 40
>> ...
>>
>> From the table, we can easily see that div.c:39 is a changed source
>> line (it's mapped to -1), so the following two streams are not matched.
>>
>> cycles: 1, hits: 26.80%         cycles: 1, hits: 27.30%
>> --------------------------      --------------------------
>> main div.c:39                   main div.c:39
>> main div.c:44                   main div.c:44
>>
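An "after -> before" line mapping like the one above could be built roughly as follows. Note this sketch uses Python's difflib in place of the external "diff" command, so it is not how the series does it. build_line_map is a hypothetical name, and line numbers are 1-based.

```python
import difflib
from typing import Dict, List


def build_line_map(before: List[str], after: List[str]) -> Dict[int, int]:
    """Map each 1-based line number of `after` to the matching line
    number in `before`, or to -1 if the line was changed or added."""
    mapping: Dict[int, int] = {}
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    for tag, i1, _i2, j1, j2 in matcher.get_opcodes():
        for offset, j in enumerate(range(j1, j2)):
            # Only lines inside an "equal" block have a counterpart;
            # "replace" and "insert" lines map to -1. "delete" blocks
            # cover no lines of `after`, so the loop body never runs.
            mapping[j + 1] = i1 + offset + 1 if tag == "equal" else -1
    return mapping


# Tiny hypothetical files; only the middle line differs.
before = ["int i;", "i = rand() % 2;", "return i;"]
after = ["int i;", "i = rand() % 4;", "return i;"]

for new_line, old_line in build_line_map(before, after).items():
    print(f"{new_line} -> {old_line}")
```

For these inputs, line 2 maps to -1 and the other lines map to themselves, which is the same shape as the div.c table.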
>> Now let's look at some examples.
>>
>> perf record -b ...      Generate perf.data.old with branch data
>> perf record -b ...      Generate perf.data with branch data
>> perf diff --stream
>>
>> [ Matched hot chains between old perf data and new perf data) ]
>>
>> hot chain pair 1:
>> cycles: 1, hits: 26.80%         cycles: 1, hits: 27.30%
>> ---------------------------     --------------------------
>> main div.c:39                   main div.c:39
>> main div.c:44                   main div.c:44
>>
>> hot chain pair 2:
>> cycles: 35, hits: 21.43%        cycles: 33, hits: 19.37%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>> __random_r random_r.c:357       __random_r random_r.c:357
>> __random random.c:293           __random random.c:293
>> __random random.c:293           __random random.c:293
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:288           __random random.c:288
>> rand rand.c:27                  rand rand.c:27
>> rand rand.c:26                  rand rand.c:26
>> rand@plt                        rand@plt
>> rand@plt                        rand@plt
>> compute_flag div.c:25           compute_flag div.c:25
>> compute_flag div.c:22           compute_flag div.c:22
>> main div.c:40                   main div.c:40
>> main div.c:40                   main div.c:40
>> main div.c:39                   main div.c:39
>>
>> hot chain pair 3:
>> cycles: 18, hits: 6.10%         cycles: 19, hits: 6.51%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>> __random_r random_r.c:357       __random_r random_r.c:357
>> __random random.c:293           __random random.c:293
>> __random random.c:293           __random random.c:293
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:288           __random random.c:288
>> rand rand.c:27                  rand rand.c:27
>> rand rand.c:26                  rand rand.c:26
>> rand@plt                        rand@plt
>> rand@plt                        rand@plt
>> compute_flag div.c:25           compute_flag div.c:25
>> compute_flag div.c:22           compute_flag div.c:22
>> main div.c:40                   main div.c:40
>>
>> hot chain pair 4:
>> cycles: 9, hits: 5.95%          cycles: 8, hits: 5.03%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>>
>> [ Hot chains in old perf data but source line changed (*) in new perf data ]
>>
>> [ Hot chains in old perf data only ]
>>
>> hot chain 1:
>> cycles: 2, hits: 4.08%
>> --------------------------
>> main div.c:42
>> compute_flag div.c:28
>>
>> [ Hot chains in new perf data only ]
>>
>> hot chain 1:
>> cycles: 36, hits: 3.36%
>> --------------------------
>> __random_r random_r.c:357
>> __random random.c:293
>> __random random.c:293
>> __random random.c:291
>> __random random.c:291
>> __random random.c:291
>> __random random.c:288
>> rand rand.c:27
>> rand rand.c:26
>> rand@plt
>> rand@plt
>> compute_flag div.c:25
>> compute_flag div.c:22
>> main div.c:40
>> main div.c:40
>>
>> If we enable the source line comparison option, the output may be different.
>>
>> perf record -b ...      Generate perf.data.old with branch data
>> perf record -b ...      Generate perf.data with branch data
>> perf diff --stream --before ./before --after ./after
>>
>> [ Matched hot chains between old perf data and new perf data) ]
>>
>> hot chain pair 1:
>> cycles: 18, hits: 6.10%         cycles: 19, hits: 6.51%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>> __random_r random_r.c:357       __random_r random_r.c:357
>> __random random.c:293           __random random.c:293
>> __random random.c:293           __random random.c:293
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:288           __random random.c:288
>> rand rand.c:27                  rand rand.c:27
>> rand rand.c:26                  rand rand.c:26
>> rand@plt                        rand@plt
>> rand@plt                        rand@plt
>> compute_flag div.c:25           compute_flag div.c:25
>> compute_flag div.c:22           compute_flag div.c:22
>> main div.c:40                   main div.c:40
>>
>> hot chain pair 2:
>> cycles: 9, hits: 5.95%          cycles: 8, hits: 5.03%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>>
>> [ Hot chains in old perf data but source line changed (*) in new perf data ]
>>
>> hot chain pair 1:
>> cycles: 1, hits: 26.80%         cycles: 1, hits: 27.30%
>> ---------------------------     --------------------------
>> main div.c:39                   main div.c:39*
>> main div.c:44                   main div.c:44
>>
>> hot chain pair 2:
>> cycles: 35, hits: 21.43%        cycles: 33, hits: 19.37%
>> ---------------------------     --------------------------
>> __random_r random_r.c:360       __random_r random_r.c:360
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:388       __random_r random_r.c:388
>> __random_r random_r.c:380       __random_r random_r.c:380
>> __random_r random_r.c:357       __random_r random_r.c:357
>> __random random.c:293           __random random.c:293
>> __random random.c:293           __random random.c:293
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:291           __random random.c:291
>> __random random.c:288           __random random.c:288
>> rand rand.c:27                  rand rand.c:27
>> rand rand.c:26                  rand rand.c:26
>> rand@plt                        rand@plt
>> rand@plt                        rand@plt
>> compute_flag div.c:25           compute_flag div.c:25
>> compute_flag div.c:22           compute_flag div.c:22
>> main div.c:40                   main div.c:40
>> main div.c:40                   main div.c:40
>> main div.c:39                   main div.c:39*
>>
>> [ Hot chains in old perf data only ]
>>
>> hot chain 1:
>> cycles: 2, hits: 4.08%
>> --------------------------
>> main div.c:42
>> compute_flag div.c:28
>>
>> [ Hot chains in new perf data only ]
>>
>> hot chain 1:
>> cycles: 36, hits: 3.36%
>> --------------------------
>> __random_r random_r.c:357
>> __random random.c:293
>> __random random.c:293
>> __random random.c:291
>> __random random.c:291
>> __random random.c:291
>> __random random.c:288
>> rand rand.c:27
>> rand rand.c:26
>> rand@plt
>> rand@plt
>> compute_flag div.c:25
>> compute_flag div.c:22
>> main div.c:40
>> main div.c:40
>>
>> Now we can see that the following stream pair has moved to another
>> section,
>> "[ Hot chains in old perf data but source line changed (*) in new perf data ]":
>>
>> cycles: 1, hits: 26.80%         cycles: 1, hits: 27.30%
>> ---------------------------     --------------------------
>> main div.c:39                   main div.c:39*
>> main div.c:44                   main div.c:44
>>
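The way a stream pair ends up in one section or another could be sketched like this. It is only an illustration: the cover letter does not say how a changed line is paired with its old counterpart, so the sketch simply pairs entries by position. classify_pair, the tuple layout and the line_maps structure (file name to "after -> before" line table) are all assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple

# (symbol, source file or None, line number or None)
Entry = Tuple[str, Optional[str], Optional[int]]
LineMaps = Dict[str, Dict[int, int]]


def classify_pair(old: Sequence[Entry], new: Sequence[Entry], line_maps: LineMaps) -> str:
    """Return "matched", "source line changed" or "unmatched"."""
    if len(old) != len(new):
        return "unmatched"
    changed = False
    for (osym, ofile, oline), (nsym, nfile, nline) in zip(old, new):
        if osym != nsym or ofile != nfile:
            return "unmatched"
        if nfile is None:
            # No source info (e.g. rand@plt): compare by symbol only.
            continue
        # Without a table for this file, assume lines map to themselves.
        mapped = line_maps.get(nfile, {}).get(nline, nline)
        if mapped == -1:
            changed = True  # this entry would be marked with '*'
        elif mapped != oline:
            return "unmatched"
    return "source line changed" if changed else "matched"


old = [("main", "div.c", 39), ("main", "div.c", 44)]
new = [("main", "div.c", 39), ("main", "div.c", 44)]

print(classify_pair(old, new, {}))                   # matched
print(classify_pair(old, new, {"div.c": {39: -1}}))  # source line changed
```

With an empty table the pair counts as matched, as in the plain "perf diff --stream" run. Once div.c:39 maps to -1, the same pair is reported under the "source line changed" heading.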
>> v3:
>> ---
>> v2 had 14 patches, which made it hard to review.
>> v3 has only 7 patches, covering basic stream comparison.
>>
>> Jin Yao (7):
>> perf util: Create source line mapping table
>> perf util: Create streams for managing top N hottest callchains
>> perf util: Return per-event callchain streams
>> perf util: Compare two streams
>> perf util: Calculate the sum of all streams hits
>> perf util: Report hot streams
>> perf diff: Support hot streams comparison
>>
>> tools/perf/Documentation/perf-diff.txt | 14 +
>> tools/perf/builtin-diff.c | 170 +++++++-
>> tools/perf/util/Build | 1 +
>> tools/perf/util/callchain.c | 495 ++++++++++++++++++++++
>> tools/perf/util/callchain.h | 32 ++
>> tools/perf/util/srclist.c | 555 +++++++++++++++++++++++++
>> tools/perf/util/srclist.h | 65 +++
>> 7 files changed, 1319 insertions(+), 13 deletions(-)
>> create mode 100644 tools/perf/util/srclist.c
>> create mode 100644 tools/perf/util/srclist.h
>>
>> --
>> 2.17.1
>>
>