Message-ID: <0bb78298-1f27-c535-2468-2c8abd5b31aa@linux.intel.com>
Date: Tue, 13 Oct 2020 12:25:09 +0800
From: "Jin, Yao" <yao.jin@...ux.intel.com>
To: acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
mingo@...hat.com, alexander.shishkin@...ux.intel.com
Cc: Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v8 0/7] perf: Stream comparison
Hi Jiri, Hi Arnaldo,

How does the v8 series look? v6 got an ACK from Jiri, and I updated the series to v8 according to
Arnaldo's comments. Please let me know if there are still any issues with this version so that I
can continue improving the patchset.

Thanks
Jin Yao

On 10/9/2020 10:28 AM, Jin Yao wrote:
> Sometimes a small change in a hot function reduces the cycles of
> this function, but the overall workload doesn't get faster. It is
> interesting to see where the cycles have moved to.
>
> What we would like is to diff before/after streams. A stream is the
> branch history aggregated from the branch records in perf
> samples, for example, the callchains aggregated from the branch records.
>
> By browsing the hot streams, we can understand the hot code path.
> By comparing the cycle variation of the same streams between the old perf
> data and the new perf data, we can understand whether the cycles have
> moved to other code.
>
> The before stream is the stream in perf.data.old. The after stream
> is the stream in perf.data.
>
> Diffing before/after streams compares the top N hottest streams between
> two perf data files.
>
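As a rough illustration of picking the "top N hottest streams", here is a minimal sketch that sorts streams by hit count and takes the first N. The names struct demo_stream, demo_cmp_hits_desc and demo_top_n are hypothetical and are not taken from the patches; the series may select hot streams differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one aggregated stream (not the perf struct). */
struct demo_stream {
	const char *name;         /* label used for illustration only */
	unsigned long long hits;  /* samples attributed to this stream */
};

/* qsort comparator: streams with more hits sort first. */
static int demo_cmp_hits_desc(const void *a, const void *b)
{
	const struct demo_stream *sa = a, *sb = b;

	if (sa->hits < sb->hits)
		return 1;
	if (sa->hits > sb->hits)
		return -1;
	return 0;
}

/*
 * Sort streams[] in place by descending hits and return the length
 * of the "top N" prefix, which is at most n and at most nr.
 */
static size_t demo_top_n(struct demo_stream *streams, size_t nr, size_t n)
{
	assert(streams != NULL || nr == 0);

	if (nr > 1)
		qsort(streams, nr, sizeof(*streams), demo_cmp_hits_desc);

	return nr < n ? nr : n;
}
```

After demo_top_n() returns k, the entries streams[0] to streams[k - 1] are the hottest ones and can be compared between the two data files.
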
> If all entries of one stream in perf.data.old fully match
> all entries of another stream in perf.data, we consider the two streams
> matched; otherwise the streams are not matched.
>
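The matching rule above can be sketched as an entry-by-entry comparison. This is only a sketch under assumptions: struct demo_entry, demo_str_eq and demo_streams_matched are hypothetical names, and comparing entries by symbol and source line strings is an assumption made for illustration. The actual series may compare entries differently (for example by address).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream entry: one line of the printed callchain. */
struct demo_entry {
	const char *sym;      /* e.g. "main" */
	const char *srcline;  /* e.g. "div.c:39"; NULL when unknown (rand@plt) */
};

/* NULL-safe string equality: two NULLs are equal, NULL vs. string is not. */
static bool demo_str_eq(const char *a, const char *b)
{
	if (!a || !b)
		return a == b;
	return strcmp(a, b) == 0;
}

/*
 * Two streams are matched only if they have the same number of
 * entries and every entry matches the entry at the same position.
 */
static bool demo_streams_matched(const struct demo_entry *old_s, size_t old_nr,
				 const struct demo_entry *new_s, size_t new_nr)
{
	size_t i;

	assert(old_s != NULL && new_s != NULL);

	if (old_nr != new_nr)
		return false;

	for (i = 0; i < old_nr; i++) {
		if (!demo_str_eq(old_s[i].sym, new_s[i].sym) ||
		    !demo_str_eq(old_s[i].srcline, new_s[i].srcline))
			return false;
	}
	return true;
}
```

With this rule, a stream that is a prefix of another stream is not matched, which is consistent with the requirement that all entries on both sides match.
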
> For example,
>
>    cycles: 1, hits: 26.80%             cycles: 1, hits: 27.30%
> --------------------------          --------------------------
>              main div.c:39                       main div.c:39
>              main div.c:44                       main div.c:44
>
> The above streams are matched. For the same streams, the cycles (1)
> are equal and the callchain hit percentages changed slightly
> (26.80% vs. 27.30%). That's expected.
>
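For reference, a hit percentage like the ones shown above could be derived as a stream's hits divided by the sum of all streams' hits. This is an assumption based on the printed output and the patch title "Calculate the sum of total streams hits"; demo_total_hits and demo_hits_percent are hypothetical helpers, not functions from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the hit counts of all streams (hypothetical helper). */
static unsigned long long demo_total_hits(const unsigned long long *hits,
					  size_t nr)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += hits[i];
	return total;
}

/* Share of one stream in the total, as a percentage; 0 if total is 0. */
static double demo_hits_percent(unsigned long long hits,
				unsigned long long total)
{
	assert(hits <= total);
	return total ? 100.0 * (double)hits / (double)total : 0.0;
}
```

Because each file has its own total, the percentage of a matched stream can change between perf.data.old and perf.data even when its cycles stay the same.
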
> Now let's see an example.
>
> perf record -b ...      Generate perf.data.old with branch data
> perf record -b ...      Generate perf.data with branch data
> perf diff --stream
>
> [ Matched hot streams ]
>
> hot chain pair 1:
>    cycles: 1, hits: 27.77%              cycles: 1, hits: 9.24%
> --------------------------          --------------------------
>              main div.c:39                       main div.c:39
>              main div.c:44                       main div.c:44
>
> hot chain pair 2:
>   cycles: 34, hits: 20.06%            cycles: 27, hits: 16.98%
> --------------------------          --------------------------
>  __random_r random_r.c:360           __random_r random_r.c:360
>  __random_r random_r.c:388           __random_r random_r.c:388
>  __random_r random_r.c:388           __random_r random_r.c:388
>  __random_r random_r.c:380           __random_r random_r.c:380
>  __random_r random_r.c:357           __random_r random_r.c:357
>      __random random.c:293               __random random.c:293
>      __random random.c:293               __random random.c:293
>      __random random.c:291               __random random.c:291
>      __random random.c:291               __random random.c:291
>      __random random.c:291               __random random.c:291
>      __random random.c:288               __random random.c:288
>             rand rand.c:27                      rand rand.c:27
>             rand rand.c:26                      rand rand.c:26
>                   rand@plt                            rand@plt
>                   rand@plt                            rand@plt
>      compute_flag div.c:25               compute_flag div.c:25
>      compute_flag div.c:22               compute_flag div.c:22
>              main div.c:40                       main div.c:40
>              main div.c:40                       main div.c:40
>              main div.c:39                       main div.c:39
>
> hot chain pair 3:
>     cycles: 9, hits: 4.48%              cycles: 6, hits: 4.51%
> --------------------------          --------------------------
>  __random_r random_r.c:360           __random_r random_r.c:360
>  __random_r random_r.c:388           __random_r random_r.c:388
>  __random_r random_r.c:388           __random_r random_r.c:388
>  __random_r random_r.c:380           __random_r random_r.c:380
>
> [ Hot streams in old perf data only ]
>
> hot chain 1:
> cycles: 18, hits: 6.75%
> --------------------------
> __random_r random_r.c:360
> __random_r random_r.c:388
> __random_r random_r.c:388
> __random_r random_r.c:380
> __random_r random_r.c:357
> __random random.c:293
> __random random.c:293
> __random random.c:291
> __random random.c:291
> __random random.c:291
> __random random.c:288
> rand rand.c:27
> rand rand.c:26
> rand@plt
> rand@plt
> compute_flag div.c:25
> compute_flag div.c:22
> main div.c:40
>
> hot chain 2:
> cycles: 29, hits: 2.78%
> --------------------------
> compute_flag div.c:22
> main div.c:40
> main div.c:40
> main div.c:39
>
> [ Hot streams in new perf data only ]
>
> hot chain 1:
> cycles: 4, hits: 4.54%
> --------------------------
> main div.c:42
> compute_flag div.c:28
>
> hot chain 2:
> cycles: 5, hits: 3.51%
> --------------------------
> main div.c:39
> main div.c:44
> main div.c:42
> compute_flag div.c:28
>
> v8:
> ---
> Rebase to perf/core
>
> v7:
> ---
> Create a new struct evlist_streams which contains ev_streams and
> nr_evsel, so we don't need to pass nr_evsel in stream related functions.
>
> Rename functions for better coding style.
>
> v6:
> ---
> Rebase to perf/core
>
> v5:
> ---
> 1. Remove enum stream_type
> 2. Rebase to perf/core
>
> v4:
> ---
> The previous version was too big and very hard to review.
>
> 1. v4 removes the code which supports the source line mapping
> table and removes the source line based comparison. Now we
> only support the basic functionality of stream comparison.
>
> 2. Refactor the code in a generic way.
>
> v3:
> ---
> v2 has 14 patches, which makes it hard to review.
> v3 has only 7 patches for basic stream comparison.
>
> Jin Yao (7):
> perf util: Create streams
> perf util: Get the evsel_streams by evsel_idx
> perf util: Compare two streams
> perf util: Link stream pair
> perf util: Calculate the sum of total streams hits
> perf util: Report hot streams
> perf diff: Support hot streams comparison
>
>  tools/perf/Documentation/perf-diff.txt |   4 +
>  tools/perf/builtin-diff.c              | 119 ++++++++-
>  tools/perf/util/Build                  |   1 +
>  tools/perf/util/callchain.c            |  99 +++++++
>  tools/perf/util/callchain.h            |   9 +
>  tools/perf/util/stream.c               | 342 +++++++++++++++++++++++++
>  tools/perf/util/stream.h               |  41 +++
>  7 files changed, 602 insertions(+), 13 deletions(-)
>  create mode 100644 tools/perf/util/stream.c
>  create mode 100644 tools/perf/util/stream.h
>