Message-Id: <1493380030-4683-3-git-send-email-mark.rutland@arm.com>
Date: Fri, 28 Apr 2017 12:47:10 +0100
From: Mark Rutland <mark.rutland@....com>
To: linux-kernel@...r.kernel.org
Cc: acme@...nel.org, alexander.shishkin@...ux.intel.com,
ganapatrao.kulkarni@...ium.com, mark.rutland@....com
Subject: [PATCH 2/2] perf stat: balance opening/closing of events

While perf-stat has its own create_perf_stat_counter() helper to open
events, depending on the target configuration, it uses the common
perf_evlist__close() helper to close them.

The common perf_evlist__{open,close}() helpers don't consider the target
configuration, and always use evsel->cpus, even where
create_perf_stat_counter() would have used an empty_cpu_map (see the
sketch below).

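For reference, the open side looks roughly like this in the v4.11-era
sources (a paraphrased sketch with details elided, not the verbatim
tree code):

  /* Paraphrased from tools/perf/builtin-stat.c */
  static int create_perf_stat_counter(struct perf_evsel *evsel)
  {
  	/* A per-CPU target opens on each CPU in evsel->cpus ... */
  	if (target__has_cpu(&target))
  		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));

  	/* ... otherwise we open per-thread, never touching evsel->cpus. */
  	return perf_evsel__open_per_thread(evsel, evsel_list->threads);
  }

  /* Paraphrased from tools/perf/util/evsel.c */
  int perf_evsel__open_per_thread(struct perf_evsel *evsel,
  				struct thread_map *threads)
  {
  	/* A NULL cpu map makes perf_evsel__open() fall back to
  	 * empty_cpu_map, i.e. FDs for a single dummy CPU per thread. */
  	return perf_evsel__open(evsel, NULL, threads);
  }
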
On some systems, the CPU PMU describes its cpus under sysfs, and
evsel->cpus may be set even when we open per-thread events. For
per-thread events, we don't use evsel->cpus in
create_perf_stat_counter(), though perf_evlist__close() will. Thus we
try to close more events than we allocated and opened, resulting in
segfaults when we go out of bounds:

  $ perf stat -e armv8_pmuv3/cpu_cycles/ true
  *** Error in `./old-perf': free(): invalid next size (fast): 0x00000000263a55c0 ***
  Aborted

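The close side, by contrast, sizes its loops from evsel->cpus whenever
that is set, regardless of how the events were opened. Roughly (again a
paraphrased sketch of the v4.11-era tools/perf/util/evlist.c, not the
verbatim code):

  void perf_evlist__close(struct perf_evlist *evlist)
  {
  	struct perf_evsel *evsel;
  	int ncpus = cpu_map__nr(evlist->cpus);
  	int nthreads = thread_map__nr(evlist->threads);

  	evlist__for_each_entry_reverse(evlist, evsel) {
  		/* Trusts evsel->cpus when set, even though a per-thread
  		 * open only allocated FDs for the one dummy CPU in
  		 * empty_cpu_map. */
  		int n = evsel->cpus ? cpu_map__nr(evsel->cpus) : ncpus;

  		perf_evsel__close(evsel, n, nthreads);
  	}
  }

So with, e.g., a PMU advertising 4 cpus and a per-thread open, we
allocate 1 * nthreads FDs but close 4 * nthreads of them, walking past
the end of the allocation.
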
This problem was introduced by commit:

  3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")

... prior to that commit, we closed events as we read them, using the
correct number of CPUs.

This patch fixes the problem by adding a new close_counters() routine
that mirrors create_perf_stat_counter(), ensuring that we always
correctly balance open/close.
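For context, perf_evsel__close_per_cpu() and
perf_evsel__close_per_thread() come from patch 1/2 of this series (not
shown here); their call sites in the diff below imply signatures
mirroring the existing perf_evsel__open_per_{cpu,thread}() helpers. A
hypothetical shape, assuming they are thin wrappers around
perf_evsel__close() using the same dimensions as the corresponding
opens:

  /* Hypothetical sketch only -- the real helpers are in patch 1/2. */
  void perf_evsel__close_per_cpu(struct perf_evsel *evsel,
  			       struct cpu_map *cpus)
  {
  	/* Per-CPU opens used a single dummy thread. */
  	perf_evsel__close(evsel, cpu_map__nr(cpus), 1);
  }

  void perf_evsel__close_per_thread(struct perf_evsel *evsel,
  				  struct thread_map *threads)
  {
  	/* Per-thread opens used the single dummy CPU in empty_cpu_map. */
  	perf_evsel__close(evsel, 1, thread_map__nr(threads));
  }
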
Fixes: 3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")
Signed-off-by: Mark Rutland <mark.rutland@....com>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@...ium.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org
---
 tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a935b50..ceedc0a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -364,6 +364,28 @@ static void read_counters(void)
 	}
 }
 
+/*
+ * Close all event FDs we open in __run_perf_stat() and
+ * create_perf_stat_counter(), taking care to match the number of threads and CPUs.
+ *
+ * Note that perf_evlist__close(evsel_list) is not equivalent, as it doesn't
+ * take the target into account.
+ */
+static void close_counters(void)
+{
+	bool per_cpu = target__has_cpu(&target);
+	struct perf_evsel *evsel;
+
+	evlist__for_each_entry(evsel_list, evsel) {
+		if (per_cpu)
+			perf_evsel__close_per_cpu(evsel,
+						  perf_evsel__cpus(evsel));
+		else
+			perf_evsel__close_per_thread(evsel,
+						     evsel_list->threads);
+	}
+}
+
 static void process_interval(void)
 {
 	struct timespec ts, rs;
@@ -704,7 +726,7 @@ static int __run_perf_stat(int argc, const char **argv)
 	 * group leaders.
 	 */
 	read_counters();
-	perf_evlist__close(evsel_list);
+	close_counters();
 
 	return WEXITSTATUS(status);
 }
--
1.9.1