Message-Id: <20190213114716.63972-2-alexander.shishkin@linux.intel.com>
Date: Wed, 13 Feb 2019 13:47:15 +0200
From: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
jolsa@...hat.com,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Subject: [PATCH v0 1/2] perf: Add an option to ask for high order allocations for AUX buffers

Currently, the AUX buffer allocator uses high-order allocations for PMUs
that don't support hardware scatter-gather chaining, to ensure large
contiguous blocks of pages, and always uses an array of single pages
otherwise.

There is, however, a tangible performance benefit in using larger chunks
of contiguous memory even in the latter case, which comes from not having
to fetch the next page's address at every page boundary. In particular,
a task running under Intel PT on an Atom CPU shows a 1.5-2 percentage
point lower runtime penalty with a single multi-page output region in
snapshot mode (no PMI) than with multiple single-page output regions:
from ~6% down to ~4%. This matters for snapshot mode in particular,
because it is intended to run over long periods of time.

Given the above, add an attribute bit, aux_highorder, to ask for a
high-order AUX allocation. To prevent an unprivileged user from using up
the higher orders of the page allocator, require CAP_SYS_ADMIN for this
option; without it, perf_event_open() fails with -EACCES.

Signed-off-by: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
---
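Not part of the commit message: a minimal userspace sketch of how the new
bit changes the way an AUX buffer is split into chunks. It assumes that
rb_alloc_aux() picks min(max_order, ilog2(remaining pages)) for each chunk,
as the comment in the ring_buffer.c hunk suggests. All model_* names and
MODEL_CAP_AUX_NO_SG are hypothetical stand-ins, not kernel symbols. The
sketch does not model the page allocator falling back to lower orders or
the CAP_SYS_ADMIN check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PERF_PMU_CAP_AUX_NO_SG flag. */
#define MODEL_CAP_AUX_NO_SG 0x1u

/* Integer log2, rounded down (returns 0 for n == 0 or n == 1). */
static int model_ilog2(unsigned long n)
{
	int order = 0;

	while (n >>= 1)
		order++;
	return order;
}

/* Mirrors the condition this patch adds to rb_alloc_aux(). */
static bool model_use_high_order(unsigned int pmu_caps, bool aux_highorder)
{
	return (pmu_caps & MODEL_CAP_AUX_NO_SG) || aux_highorder;
}

/*
 * Split nr_pages into chunks of 2^order pages each. Stores the order of
 * every chunk in orders[] (at most max_chunks entries) and returns the
 * number of chunks written.
 */
static int model_split(unsigned long nr_pages, bool high_order,
		       int *orders, int max_chunks)
{
	/* Largest order that fits in nr_pages, or 0 for single pages. */
	int max_order = high_order ? model_ilog2(nr_pages) : 0;
	unsigned long done = 0;
	int n = 0;

	while (done < nr_pages && n < max_chunks) {
		int order = model_ilog2(nr_pages - done);

		if (order > max_order)
			order = max_order;
		orders[n++] = order;
		done += 1UL << order;
	}
	return n;
}
```

Under these assumptions, a 12-page buffer becomes two chunks (order 3 and
order 2) when model_use_high_order() is true, and twelve order-0 pages
otherwise.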
include/uapi/linux/perf_event.h | 3 ++-
kernel/events/core.c | 3 +++
kernel/events/ring_buffer.c | 3 ++-
3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7198ddd0c6b1..04726b5729c8 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -374,7 +374,8 @@ struct perf_event_attr {
namespaces : 1, /* include namespaces data */
ksymbol : 1, /* include ksymbol events */
bpf_event : 1, /* include bpf events */
- __reserved_1 : 33;
+ aux_highorder : 1, /* use high order allocations for AUX data */
+ __reserved_1 : 32;
union {
__u32 wakeup_events; /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5aeb4c74fb99..ba95398505c5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10688,6 +10688,9 @@ SYSCALL_DEFINE5(perf_event_open,
perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
return -EACCES;
+ if (attr.aux_highorder && !capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
/*
* In cgroup mode, the pid argument is used to pass the fd
* opened to the cgroup directory in cgroupfs. The cpu argument
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 70ae2422cbaf..72b7380deb0a 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -603,7 +603,8 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
if (!has_aux(event))
return -EOPNOTSUPP;
- if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG) {
+ if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG ||
+ event->attr.aux_highorder) {
/*
* We need to start with the max_order that fits in nr_pages,
* not the other way around, hence ilog2() and not get_order.
--
2.20.1