lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190213114716.63972-2-alexander.shishkin@linux.intel.com>
Date:   Wed, 13 Feb 2019 13:47:15 +0200
From:   Alexander Shishkin <alexander.shishkin@...ux.intel.com>
To:     Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Arnaldo Carvalho de Melo <acme@...hat.com>
Cc:     Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        jolsa@...hat.com,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Subject: [PATCH v0 1/2] perf: Add an option to ask for high order allocations for AUX buffers

Currently, the AUX buffer allocator will use high-order allocations
for PMUs that don't support hardware scatter-gather chaining to ensure
large contiguous blocks of pages, and always use an array of single
pages otherwise.

There is, however, a tangible performance benefit in using larger chunks
of contiguous memory even in the latter case, that comes from not having
to fetch the next page's address at every page boundary. In particular,
a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
penalty with a single multi-page output region in snapshot mode (no PMI)
than with multiple single-page output regions, from ~6% down to ~4%. For
the snapshot mode it does make a difference as it is intended to run over
long periods of time.

Following the above justification, add an attribute bit to ask for a
high-order AUX allocation. To prevent an unprivileged user from using up
the higher orders of the page allocator, require CAP_SYS_ADMIN for this
option.

Signed-off-by: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
---
 include/uapi/linux/perf_event.h | 3 ++-
 kernel/events/core.c            | 3 +++
 kernel/events/ring_buffer.c     | 3 ++-
 3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7198ddd0c6b1..04726b5729c8 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -374,7 +374,8 @@ struct perf_event_attr {
 				namespaces     :  1, /* include namespaces data */
 				ksymbol        :  1, /* include ksymbol events */
 				bpf_event      :  1, /* include bpf events */
-				__reserved_1   : 33;
+				aux_highorder  :  1, /* use high order allocations for AUX data */
+				__reserved_1   : 32;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5aeb4c74fb99..ba95398505c5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10688,6 +10688,9 @@ SYSCALL_DEFINE5(perf_event_open,
 	    perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
+	if (attr.aux_highorder && !capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	/*
 	 * In cgroup mode, the pid argument is used to pass the fd
 	 * opened to the cgroup directory in cgroupfs. The cpu argument
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 70ae2422cbaf..72b7380deb0a 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -603,7 +603,8 @@ int rb_alloc_aux(struct ring_buffer *rb, struct perf_event *event,
 	if (!has_aux(event))
 		return -EOPNOTSUPP;
 
-	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG) {
+	if (event->pmu->capabilities & PERF_PMU_CAP_AUX_NO_SG ||
+	    event->attr.aux_highorder) {
 		/*
 		 * We need to start with the max_order that fits in nr_pages,
 		 * not the other way around, hence ilog2() and not get_order.
-- 
2.20.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ