Message-Id: <20231007021326.4156714-1-irogers@google.com>
Date: Fri, 6 Oct 2023 19:13:19 -0700
From: Ian Rogers <irogers@...gle.com>
To: Suzuki K Poulose <suzuki.poulose@....com>,
Mike Leach <mike.leach@...aro.org>,
James Clark <james.clark@....com>,
Leo Yan <leo.yan@...aro.org>,
John Garry <john.g.garry@...cle.com>,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Thomas Richter <tmricht@...ux.ibm.com>,
Ravi Bangoria <ravi.bangoria@....com>,
Kajol Jain <kjain@...ux.ibm.com>,
Jing Zhang <renyu.zj@...ux.alibaba.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Yang Jihong <yangjihong1@...wei.com>,
coresight@...ts.linaro.org, linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH v1 0/7] PMU performance improvements
Performance improvements to PMU scanning by holding onto the
event/metric tables for a cpuid (avoiding repeated regular expression
comparisons) and by lazily computing the default perf_event_attr for a
PMU.
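
As a rough illustration of the two ideas (this is not the perf C code;
TableCache, Pmu, the table contents and the cpuid patterns are all
hypothetical names made up for the sketch), remembering the last table
lookup and deferring the default config could look like this in Python:

```python
import re
from functools import cached_property

# Hypothetical stand-in for the generated cpuid-pattern -> table mapping.
TABLES = [
    (r"VendorA-1-2[AB]", {"events": ["ev_a"], "metrics": ["m_a"]}),
    (r"VendorB-3-4[CD]", {"events": ["ev_b"], "metrics": ["m_b"]}),
]


class TableCache:
    """Remember the table found for the last cpuid to skip regex matching."""

    def __init__(self, tables):
        self.tables = tables
        self.regex_matches = 0   # how many times the slow path ran a regex
        self._last = None        # (cpuid, table) from the previous lookup

    def find(self, cpuid):
        # Fast path: same cpuid as last time, no regex comparison needed.
        if self._last is not None and self._last[0] == cpuid:
            return self._last[1]
        # Slow path: compare the cpuid against each pattern in turn.
        for pattern, table in self.tables:
            self.regex_matches += 1
            if re.fullmatch(pattern, cpuid):
                self._last = (cpuid, table)
                return table
        return None


class Pmu:
    """Compute the default config on first use rather than at scan time."""

    def __init__(self, name):
        self.name = name
        self.computed = 0        # counts how often the config was built

    @cached_property
    def default_config(self):
        # Placeholder for the (possibly expensive) setup work.
        self.computed += 1
        return {"type": 0, "config": 0}
```

With this shape, scanning many PMUs for the same cpuid pays for the
regular expression comparison once, and a PMU whose default config is
never requested pays nothing for it.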
Before:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 251.990 usec (+- 4.009 usec)
Average PMU scanning took: 3222.460 usec (+- 211.234 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 260.120 usec (+- 7.905 usec)
Average PMU scanning took: 3228.995 usec (+- 211.196 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 252.310 usec (+- 3.980 usec)
Average PMU scanning took: 3220.675 usec (+- 210.844 usec)
After:
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 28.530 usec (+- 0.602 usec)
Average PMU scanning took: 275.725 usec (+- 18.253 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 28.720 usec (+- 0.446 usec)
Average PMU scanning took: 271.015 usec (+- 18.762 usec)
% Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 31.040 usec (+- 0.612 usec)
Average PMU scanning took: 267.340 usec (+- 17.209 usec)
Measuring the pmu-scan benchmark on a Tiger Lake laptop, core PMU
scanning is reduced to 11.5% of the previous execution time and all PMU
scanning is reduced to 8.4% of the previous execution time. There is
also a 4.3% reduction in openat system calls.
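
For reference, the percentages can be reproduced from the three runs
listed above by comparing the mean of the reported averages. The
snippet below is only a convenience for checking the arithmetic; the
variable and function names are made up for it:

```python
from statistics import mean

# Reported averages in usec, copied from the benchmark output above.
before_core = [251.990, 260.120, 252.310]
after_core = [28.530, 28.720, 31.040]
before_all = [3222.460, 3228.995, 3220.675]
after_all = [275.725, 271.015, 267.340]


def remaining_pct(before, after):
    """Time after the change as a percentage of the time before it."""
    return 100.0 * mean(after) / mean(before)


print(f"core PMU scan: {remaining_pct(before_core, after_core):.1f}% of previous time")
print(f"all PMU scan:  {remaining_pct(before_all, after_all):.1f}% of previous time")
```

This gives roughly 11.5% for core PMU scanning and 8.4% for all PMU
scanning.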
Ian Rogers (7):
  perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_init
  perf intel-pt: Move PMU initialization from default config code
  perf arm-spe: Move PMU initialization from default config code
  perf pmu: Const-ify file APIs
  perf pmu: Const-ify perf_pmu__config_terms
  perf pmu-events: Remember the events and metrics table
  perf pmu: Lazily compute default config

 tools/perf/arch/arm/util/cs-etm.c    | 13 ++------
 tools/perf/arch/arm/util/pmu.c       | 10 +++---
 tools/perf/arch/arm64/util/arm-spe.c | 48 +++++++++++++---------------
 tools/perf/arch/s390/util/pmu.c      |  3 +-
 tools/perf/arch/x86/util/intel-pt.c  | 27 +++++++---------
 tools/perf/arch/x86/util/pmu.c       |  6 ++--
 tools/perf/pmu-events/jevents.py     | 48 ++++++++++++++++------------
 tools/perf/util/arm-spe.h            |  4 ++-
 tools/perf/util/cs-etm.h             |  2 +-
 tools/perf/util/intel-pt.h           |  3 +-
 tools/perf/util/parse-events.c       | 12 +++----
 tools/perf/util/pmu.c                | 39 +++++++++++-----------
 tools/perf/util/pmu.h                | 18 ++++++-----
 tools/perf/util/python.c             |  2 +-
 14 files changed, 117 insertions(+), 118 deletions(-)
--
2.42.0.609.gbb76f46606-goog