Message-ID: <20241211133403.208920-8-jolsa@kernel.org>
Date: Wed, 11 Dec 2024 14:33:56 +0100
From: Jiri Olsa <jolsa@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrii Nakryiko <andrii@...nel.org>
Cc: bpf@...r.kernel.org,
	Song Liu <songliubraving@...com>,
	Yonghong Song <yhs@...com>,
	John Fastabend <john.fastabend@...il.com>,
	Hao Luo <haoluo@...gle.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Alan Maguire <alan.maguire@...cle.com>,
	linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org
Subject: [PATCH bpf-next 07/13] uprobes/x86: Add support to emulate nop5 instruction

Add support for emulating nop5 when it is the original instruction
at the uprobe site.

This speeds up uprobes placed on nop5 instructions by roughly 2.9x
(results from benchs/run_bench_uprobes.sh):

current:

     uprobe-nop     :    3.252 ± 0.019M/s
     uprobe-push    :    3.097 ± 0.002M/s
     uprobe-ret     :    1.116 ± 0.001M/s
 --> uprobe-nop5    :    1.115 ± 0.001M/s
     uretprobe-nop  :    1.731 ± 0.016M/s
     uretprobe-push :    1.673 ± 0.023M/s
     uretprobe-ret  :    0.843 ± 0.009M/s
 --> uretprobe-nop5 :    1.124 ± 0.001M/s

after the change:

     uprobe-nop     :    3.281 ± 0.003M/s
     uprobe-push    :    3.085 ± 0.003M/s
     uprobe-ret     :    1.130 ± 0.000M/s
 --> uprobe-nop5    :    3.276 ± 0.007M/s
     uretprobe-nop  :    1.716 ± 0.016M/s
     uretprobe-push :    1.651 ± 0.017M/s
     uretprobe-ret  :    0.846 ± 0.006M/s
 --> uretprobe-nop5 :    3.279 ± 0.002M/s

Strangely, uretprobe-nop5 is now much faster than uretprobe-nop,
even though the perf profiles for both are almost identical.
I'm still looking into that.

Signed-off-by: Jiri Olsa <jolsa@...nel.org>
---
 arch/x86/kernel/uprobes.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 23e4f2821cff..cdea97f8cd39 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -909,6 +909,11 @@ static const struct uprobe_xol_ops push_xol_ops = {
 	.emulate  = push_emulate_op,
 };
 
+static int is_nop5_insn(uprobe_opcode_t *insn)
+{
+	return !memcmp(insn, x86_nops[5], 5);
+}
+
 /* Returns -ENOSYS if branch_xol_ops doesn't handle this insn */
 static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 {
@@ -928,6 +933,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 		break;
 
 	case 0x0f:
+		if (is_nop5_insn((uprobe_opcode_t *) &auprobe->insn))
+			goto setup;
 		if (insn->opcode.nbytes != 2)
 			return -ENOSYS;
 		/*
-- 
2.47.0

