lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1445295269-5672-2-git-send-email-andi@firstfloor.org>
Date:	Mon, 19 Oct 2015 15:54:29 -0700
From:	Andi Kleen <andi@...stfloor.org>
To:	x86@...nel.org
Cc:	peterz@...radead.org, linux-kernel@...r.kernel.org,
	Andi Kleen <ak@...ux.intel.com>
Subject: [PATCH 2/2] x86, perf: Optimize stack walk user accesses

From: Andi Kleen <ak@...ux.intel.com>

Change the perf user stack walking to use the new __copy_from_user_nmi,
and split each access into word sized transfer sizes. This allows
to inline the complete access and optimize it all into a single load.

The main advantage is that this avoids the overhead of double page faults.
When normal copy_from_user fails it reexecutes the copy to compute
an accurate number of non copied bytes. This leads to executing
the expensive page fault twice.

While walking stacks having a fault at some point is relatively
common (typically when some part of the program isn't compiled
with frame pointers), so this is a large overhead.

With the optimized copies we avoid this problem because they
only do all accesses once. And of course they're much faster too
when the access does not fault because they're just single instructions
instead of complex function calls.

While profiling a kernel build with -g, the patch brings down
the average time of the PMI handler from 966ns to 552ns (-43%)

Signed-off-by: Andi Kleen <ak@...ux.intel.com>
---
 arch/x86/kernel/cpu/perf_event.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 4562cf0..15bf8b3 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2255,7 +2255,12 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
 		frame.next_frame     = 0;
 		frame.return_address = 0;
 
-		bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
+		if (!access_ok(VERIFY_READ, fp, 8))
+			break;
+		bytes = __copy_from_user_nmi(&frame.next_frame, fp, 4);
+		if (bytes != 0)
+			break;
+		bytes = __copy_from_user_nmi(&frame.return_address, fp+4, 4);
 		if (bytes != 0)
 			break;
 
@@ -2307,7 +2312,13 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 		frame.next_frame	     = NULL;
 		frame.return_address = 0;
 
-		bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
+		if (!access_ok(VERIFY_READ, fp, 16))
+			break;
+
+		bytes = __copy_from_user_nmi(&frame.next_frame, fp, 8);
+		if (bytes != 0)
+			break;
+		bytes = __copy_from_user_nmi(&frame.return_address, fp+8, 8);
 		if (bytes != 0)
 			break;
 
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ