Message-ID: <158755782945.28353.13339923488972169310.tip-bot2@tip-bot2>
Date: Wed, 22 Apr 2020 12:17:09 -0000
From: "tip-bot2 for Kan Liang" <tip-bot2@...utronix.de>
To: linux-tip-commits@...r.kernel.org
Cc: Kan Liang <kan.liang@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Alexey Budankov <alexey.budankov@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Michael Ellerman <mpe@...erman.id.au>,
Namhyung Kim <namhyung@...nel.org>,
Pavel Gerasimov <pavel.gerasimov@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Ravi Bangoria <ravi.bangoria@...ux.ibm.com>,
Stephane Eranian <eranian@...gle.com>,
Vitaly Slobodskoy <vitaly.slobodskoy@...el.com>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
x86 <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: [tip: perf/core] perf hist: Add fast path for duplicate entries check
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 12e89e65f446476951f42aedeef56b6bd6f7f1e6
Gitweb: https://git.kernel.org/tip/12e89e65f446476951f42aedeef56b6bd6f7f1e6
Author: Kan Liang <kan.liang@...ux.intel.com>
AuthorDate: Thu, 19 Mar 2020 13:25:17 -07:00
Committer: Arnaldo Carvalho de Melo <acme@...hat.com>
CommitterDate: Sat, 18 Apr 2020 09:05:01 -03:00
perf hist: Add fast path for duplicate entries check
Perf checks for duplicate entries in a callchain before adding an entry.
However, the check is very slow, especially with deep call stacks.
Almost 50% of the elapsed time of perf report is spent on the check when
the call stack depth is always 32.
hist_entry__cmp() is used to compare the new entry with the existing
entries. It goes through all the available sorts in the sort_list
and calls the specific cmp of each sort, which is very slow.
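To illustrate why the full comparison is costly, here is a minimal sketch of a comparison that walks a list of per-key comparators, in the spirit of what the paragraph above describes. The struct entry type, the cmp_* helpers and entry_cmp() are hypothetical stand-ins chosen for illustration, not perf's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry; perf's struct hist_entry has many more fields. */
struct entry {
	int pid;
	uint64_t sym_start;
	uint64_t ip;
};

/* One comparator per sort key (assumed shape, for illustration only). */
typedef int64_t (*cmp_fn)(const struct entry *l, const struct entry *r);

static int64_t cmp_u64(uint64_t l, uint64_t r)
{
	return (l > r) - (l < r);
}

static int64_t cmp_pid(const struct entry *l, const struct entry *r)
{
	return (int64_t)l->pid - (int64_t)r->pid;
}

static int64_t cmp_sym(const struct entry *l, const struct entry *r)
{
	return cmp_u64(l->sym_start, r->sym_start);
}

static int64_t cmp_ip(const struct entry *l, const struct entry *r)
{
	return cmp_u64(l->ip, r->ip);
}

/* Walk every sort key in order and stop at the first one that differs. */
static int64_t entry_cmp(const struct entry *l, const struct entry *r,
			 const cmp_fn *keys, size_t nr_keys)
{
	int64_t cmp = 0;

	for (size_t i = 0; i < nr_keys; i++) {
		cmp = keys[i](l, r);
		if (cmp)
			break;
	}
	return cmp;
}
```

With keys ordered as { cmp_pid, cmp_sym, cmp_ip }, two entries from the same thread only diverge at the second key at the earliest, so each comparison pays for several indirect calls, and the cost grows with the number of configured sort keys.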
In most cases, there are no duplicate entries in a callchain because
the symbols are usually different. It is much faster to do a quick check
on the symbols first and only do the full cmp when the symbols are
exactly the same.
The quick check only compares symbols, not the dso. Export
_sort__sym_cmp so that hist.c can use it.
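The idea can be sketched in isolation as follows. This is a simplified illustration, not the patch itself: struct sym, struct entry, sym_key_cmp(), entry_full_cmp(), chain_has_duplicate() and the full_cmp_calls counter are hypothetical stand-ins, and the symbol comparison here only looks at a start address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's struct symbol and struct hist_entry. */
struct sym {
	uint64_t start;
};

struct entry {
	const struct sym *sym;
	uint64_t ip;
	int pid;
};

/* Counts calls to the expensive comparison, for demonstration only. */
static unsigned int full_cmp_calls;

/* Symbol key: fall back to the raw address when neither side has a symbol. */
static int sym_key_cmp(const struct entry *l, const struct entry *r)
{
	if (!l->sym && !r->sym)
		return (l->ip > r->ip) - (l->ip < r->ip);
	if (!l->sym || !r->sym)
		return l->sym ? 1 : -1;
	return (l->sym->start > r->sym->start) -
	       (l->sym->start < r->sym->start);
}

/* Stand-in for the full comparison; the real one walks many more keys. */
static int entry_full_cmp(const struct entry *l, const struct entry *r)
{
	full_cmp_calls++;
	if (l->pid != r->pid)
		return l->pid < r->pid ? -1 : 1;
	return sym_key_cmp(l, r);
}

/* Fast path: true when the symbol key alone proves the entries differ. */
static bool entry_fast_sym_diff(const struct entry *l, const struct entry *r)
{
	return sym_key_cmp(l, r) != 0;
}

/* Scan the cached chain entries, skipping the full compare when possible. */
static bool chain_has_duplicate(const struct entry *cache, size_t nr,
				const struct entry *candidate)
{
	for (size_t i = 0; i < nr; i++) {
		if (entry_fast_sym_diff(&cache[i], candidate))
			continue;
		if (entry_full_cmp(&cache[i], candidate) == 0)
			return true;
	}
	return false;
}
```

The shortcut is only safe when the symbol is one of the keys the full comparison looks at, since otherwise two entries with different symbols could still compare equal. This appears to be why the patch enables the fast path only when hists__has(he_tmp.hists, sym) is true.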
$ perf record --call-graph lbr ./tchain_edit_64
Without the patch:
$ time perf report --stdio
real 0m21.142s
user 0m21.110s
sys 0m0.033s
With the patch:
$ time perf report --stdio
real 0m10.977s
user 0m10.948s
sys 0m0.027s
Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
Acked-by: Jiri Olsa <jolsa@...hat.com>
Cc: Adrian Hunter <adrian.hunter@...el.com>
Cc: Alexey Budankov <alexey.budankov@...ux.intel.com>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@...aro.org>
Cc: Michael Ellerman <mpe@...erman.id.au>
Cc: Namhyung Kim <namhyung@...nel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ravi Bangoria <ravi.bangoria@...ux.ibm.com>
Cc: Stephane Eranian <eranian@...gle.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@...el.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-18-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
---
 tools/perf/util/hist.c | 23 +++++++++++++++++++++++
 tools/perf/util/sort.c |  2 +-
 tools/perf/util/sort.h |  2 ++
 3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 283a69f..c2550db 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1070,6 +1070,20 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
 	return fill_callchain_info(al, node, iter->hide_unresolved);
 }
 
+static bool
+hist_entry__fast__sym_diff(struct hist_entry *left,
+			   struct hist_entry *right)
+{
+	struct symbol *sym_l = left->ms.sym;
+	struct symbol *sym_r = right->ms.sym;
+
+	if (!sym_l && !sym_r)
+		return left->ip != right->ip;
+
+	return !!_sort__sym_cmp(sym_l, sym_r);
+}
+
+
 static int
 iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 			       struct addr_location *al)
@@ -1096,6 +1110,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	};
 	int i;
 	struct callchain_cursor cursor;
+	bool fast = hists__has(he_tmp.hists, sym);
 
 	callchain_cursor_snapshot(&cursor, &callchain_cursor);
 
@@ -1106,6 +1121,14 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
 	 * It's possible that it has cycles or recursive calls.
 	 */
 	for (i = 0; i < iter->curr; i++) {
+		/*
+		 * For most cases, there are no duplicate entries in callchain.
+		 * The symbols are usually different. Do a quick check for
+		 * symbols first.
+		 */
+		if (fast && hist_entry__fast__sym_diff(he_cache[i], &he_tmp))
+			continue;
+
 		if (hist_entry__cmp(he_cache[i], &he_tmp) == 0) {
 			/* to avoid calling callback function */
 			iter->he = NULL;
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f14cc72..dc15ddc 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -237,7 +237,7 @@ static int64_t _sort__addr_cmp(u64 left_ip, u64 right_ip)
 	return (int64_t)(right_ip - left_ip);
 }
 
-static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
+int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
 {
 	if (!sym_l || !sym_r)
 		return cmp_null(sym_l, sym_r);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index cfa6ac6..66d39c4 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -311,5 +311,7 @@ int64_t
 sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right);
 int64_t
 sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
+int64_t
+_sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r);
 char *hist_entry__srcline(struct hist_entry *he);
 #endif /* __PERF_SORT_H */