lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Feb 2022 00:09:20 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     x86@...nel.org, joao@...rdrivepizza.com, hjl.tools@...il.com,
        jpoimboe@...hat.com, andrew.cooper3@...rix.com,
        linux-kernel@...r.kernel.org, ndesaulniers@...gle.com,
        keescook@...omium.org, samitolvanen@...gle.com,
        mark.rutland@....com, alyssa.milburn@...el.com, mbenes@...e.cz,
        mhiramat@...nel.org, alexei.starovoitov@...il.com
Subject: Re: [PATCH v2 14/39] x86/ibt,ftrace: Make function-graph play nice

On Thu, Feb 24, 2022 at 10:42:30AM -0500, Steven Rostedt wrote:
> On Thu, 24 Feb 2022 16:36:57 +0100
> Peter Zijlstra <peterz@...radead.org> wrote:
> 
> > On Thu, Feb 24, 2022 at 03:51:52PM +0100, Peter Zijlstra wrote:
> > > @@ -316,10 +317,12 @@ SYM_FUNC_START(return_to_handler)
> > >  
> > >  	call ftrace_return_to_handler
> > >  
> > > -	movq %rax, %rdi
> > > +	movq %rax, 16(%rsp)
> > >  	movq 8(%rsp), %rdx
> > >  	movq (%rsp), %rax
> > > -	addq $24, %rsp
> > > -	JMP_NOSPEC rdi
> > > +
> > > +	addq $16, %rsp
> > > +	UNWIND_HINT_FUNC
> > > +	RET
> > >  SYM_FUNC_END(return_to_handler)
> > >  #endif  
> > 
> > While talking about this with Mark, an alternative solution is something
> > like this, that would keep the RSB balanced and only mess up the current
> > return.
> > 
> > No idea it if makes an appreciatable difference on current hardware,
> > therefore I went with the simpler option above.
> > 
> > @@ -307,7 +315,7 @@ EXPORT_SYMBOL(__fentry__)
> >  
> >  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> >  SYM_FUNC_START(return_to_handler)
> > -	subq  $24, %rsp
> > +	subq  $16, %rsp
> >  
> >  	/* Save the return values */
> >  	movq %rax, (%rsp)
> > @@ -319,7 +327,13 @@ SYM_FUNC_START(return_to_handler)
> >  	movq %rax, %rdi
> >  	movq 8(%rsp), %rdx
> >  	movq (%rsp), %rax
> > -	addq $24, %rsp
> > -	JMP_NOSPEC rdi
> > +
> > +	addq $16, %rsp
> > +	ANNOTATE_INTRA_FUNCTION_CALL
> > +	call .Ldo_rop
> > +.Ldo_rop:
> 
> What's the overhead of an added call (for every function call that is being
> traced)?

Who knows :-) That's all u-arch magic and needs testing (on lots of
hardware). I suspect the dominating cost of all this code is the RSB
miss, not a few regular instructions.

So with this alternative we'll get one guaranteed RSB miss per
construction, but at least the RSB should be mostly good again
afterwards.

With the patch as proposed, the RSB is basically scrap due to
unbalanced.

The original patch changing it to an indirect call cited <2% performance
improvement IIRC, and that all was pre-speculation mess. This
alternative isn't much different from a retpoline (all that's missing is
the speculation trap).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ