Message-ID: <Z1wwgIPh7dieKSPV@krava>
Date: Fri, 13 Dec 2024 14:02:56 +0100
From: Jiri Olsa <olsajiri@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Oleg Nesterov <oleg@...hat.com>, Andrii Nakryiko <andrii@...nel.org>,
	bpf@...r.kernel.org, Song Liu <songliubraving@...com>,
	Yonghong Song <yhs@...com>,
	John Fastabend <john.fastabend@...il.com>,
	Hao Luo <haoluo@...gle.com>, Steven Rostedt <rostedt@...dmis.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Alan Maguire <alan.maguire@...cle.com>,
	linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next 07/13] uprobes/x86: Add support to emulate nop5
 instruction

On Fri, Dec 13, 2024 at 11:45:36AM +0100, Peter Zijlstra wrote:
> On Wed, Dec 11, 2024 at 02:33:56PM +0100, Jiri Olsa wrote:
> > Adding support to emulate nop5 as the original uprobe instruction.
> > 
> > This speeds up uprobes on top of nop5 instructions:
> > (results from benchs/run_bench_uprobes.sh)
> > 
> > current:
> > 
> >      uprobe-nop     :    3.252 ± 0.019M/s
> >      uprobe-push    :    3.097 ± 0.002M/s
> >      uprobe-ret     :    1.116 ± 0.001M/s
> >  --> uprobe-nop5    :    1.115 ± 0.001M/s
> >      uretprobe-nop  :    1.731 ± 0.016M/s
> >      uretprobe-push :    1.673 ± 0.023M/s
> >      uretprobe-ret  :    0.843 ± 0.009M/s
> >  --> uretprobe-nop5 :    1.124 ± 0.001M/s
> > 
> > after the change:
> > 
> >      uprobe-nop     :    3.281 ± 0.003M/s
> >      uprobe-push    :    3.085 ± 0.003M/s
> >      uprobe-ret     :    1.130 ± 0.000M/s
> >  --> uprobe-nop5    :    3.276 ± 0.007M/s
> >      uretprobe-nop  :    1.716 ± 0.016M/s
> >      uretprobe-push :    1.651 ± 0.017M/s
> >      uretprobe-ret  :    0.846 ± 0.006M/s
> >  --> uretprobe-nop5 :    3.279 ± 0.002M/s
> > 
> > Strangely I can see uretprobe-nop5 is now much faster compared to
> > uretprobe-nop, while perf profiles for both are almost identical.
> > I'm still checking on that.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> > ---
> >  arch/x86/kernel/uprobes.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> > index 23e4f2821cff..cdea97f8cd39 100644
> > --- a/arch/x86/kernel/uprobes.c
> > +++ b/arch/x86/kernel/uprobes.c
> > @@ -909,6 +909,11 @@ static const struct uprobe_xol_ops push_xol_ops = {
> >  	.emulate  = push_emulate_op,
> >  };
> >  
> > +static int is_nop5_insn(uprobe_opcode_t *insn)
> > +{
> > +	return !memcmp(insn, x86_nops[5], 5);
> > +}
> > +
> >  /* Returns -ENOSYS if branch_xol_ops doesn't handle this insn */
> >  static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> >  {
> > @@ -928,6 +933,8 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
> >  		break;
> >  
> >  	case 0x0f:
> > +		if (is_nop5_insn((uprobe_opcode_t *) &auprobe->insn))
> > +			goto setup;
> 
> This isn't right, this is not x86_64 specific code, and there's a bunch
> of 32bit 5 byte nops that do not start with 0f.
> 
> Also, since you already have the insn decoded, I would suggest you
> simply check OPCODE2(insn) == 0x1f /* NOPL */ and length == 5.

Ah, right. OK, will change, thanks.

jirka

> 
> >  		if (insn->opcode.nbytes != 2)
> >  			return -ENOSYS;
> >  		/*
> > -- 
> > 2.47.0
> > 
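
For illustration, the check suggested above (second opcode byte 0x1f and a total length of 5) could be sketched in userspace roughly as follows. This is only a sketch under stated assumptions: struct decoded_insn and is_nopl5() are hypothetical stand-ins for the kernel's decoded instruction and are not part of the patch; in the kernel the same test would read the already decoded insn, as described in the review comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an already decoded x86 instruction.
 * This is NOT the kernel's struct insn; the field names are assumptions
 * made only to keep the example self-contained.
 */
struct decoded_insn {
	uint8_t opcode[2];   /* opcode bytes, prefixes excluded */
	uint8_t opcode_len;  /* number of valid opcode bytes */
	uint8_t length;      /* total instruction length in bytes */
};

/*
 * Match on the decoded opcode (0f 1f, NOPL) and the total length,
 * rather than comparing raw bytes against one fixed nop encoding.
 */
bool is_nopl5(const struct decoded_insn *insn)
{
	return insn->opcode_len == 2 &&
	       insn->opcode[0] == 0x0f &&
	       insn->opcode[1] == 0x1f &&
	       insn->length == 5;
}

/* Small self-check covering one match and two near misses. */
void nopl5_self_check(void)
{
	struct decoded_insn nopl5 = { { 0x0f, 0x1f }, 2, 5 };
	struct decoded_insn nopl4 = { { 0x0f, 0x1f }, 2, 4 };
	struct decoded_insn jcc   = { { 0x0f, 0x84 }, 2, 6 };

	assert(is_nopl5(&nopl5));   /* NOPL, 5 bytes long */
	assert(!is_nopl5(&nopl4));  /* NOPL, but wrong length */
	assert(!is_nopl5(&jcc));    /* two-byte opcode, but not NOPL */
}
```

Because the test looks only at the decoded opcode and length, it does not depend on a byte-for-byte comparison with a single nop table entry, and it rejects other two-byte 0x0f opcodes that reach the same switch case.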
