Message-ID: <Yjnvsrp8253bxWPA@hirez.programming.kicks-ass.net>
Date: Tue, 22 Mar 2022 16:48:02 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux Next Mailing List <linux-next@...r.kernel.org>,
mhiramat@...nel.org, ast@...nel.org, hjl.tools@...il.com,
rick.p.edgecombe@...el.com, rppt@...nel.org,
linux-toolchains@...r.kernel.org, Andrew.Cooper3@...rix.com,
ndesaulniers@...gle.com
Subject: Re: linux-next: build warnings after merge of the tip tree
On Tue, Mar 22, 2022 at 11:04:38AM -0400, Steven Rostedt wrote:
> > In recap:
> >
> > __fentry__ -- push on trace-stack
> > __ftail__  -- mark top-most entry complete
> > __fexit__  -- mark top-most entry complete;
> >               pop all completed entries
>
> Again, this would require that the tail-calls are also being traced.
Which is why we should inhibit tail-calls if the function is notrace.
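
Spelled out on a traced tail-call path, that recap looks something like
this (sketch; exact hook placement illustrative):

func_A:
	call __fentry__		# push func_A's entry on the trace-stack
	...
	call __ftail__		# mark the top-most (func_A) entry complete
	jmp  func_B		# func_B's own __fexit__ later marks func_B
				# complete and pops both completed entries
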
> > inhibit tail-calls to notrace.
>
> Just inhibiting tail-calls to notrace would work without any of the above.
I'm lost again; what? Without any of the above you've got nothing, because
the return trampoline will not work.
> But my fear is that will cause a noticeable performance impact.
Most code isn't in fact notrace, and call+ret aren't *that* expensive.
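
Concretely, inhibiting the tail-call just means emitting a normal call
plus return instead of the jmp, something like (sketch; func_notrace is
a made-up callee name):

	call func_notrace	# was: jmp func_notrace
	call __fexit__		# func_A's exit hook still runs
	ret			# the extra call+ret pair is the cost
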
> > It's function graph tracing, kretprobes and whatever else this rethook
> > stuff is about that needs this because return trampolines will stop
> > working somewhere in the not too distant future.
>
> Another crazy solution is to have:
>
> func_A:
> 	call __fentry__
> 	...
> tail:	jmp 1f
> 	call 1f
> 	call __fexit__
> 	ret
> 1:	jmp func_B
>
>
> where the compiler tells us about "tail:" and that we know that func_A ends
> with a tail call, and if we want to trace the end of func_A we convert that
> jmp 1f into a nop. And then we call func_B and its return comes back
> to where we call __fexit__ and then return normally.
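
For illustration, the traced state of that layout would then be (sketch;
assuming the jmp is emitted wide enough to be patched):

func_A:
	call __fentry__
	...
tail:	nop			# was: jmp 1f, patched out when tracing
	call 1f			# enter func_B via the jmp below
	call __fexit__		# func_B returned here; fire the exit hook
	ret
1:	jmp func_B
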
At that point, something like:
1:
	.pushsection __ftail_loc
	.long 1b - .
	.popsection
	jmp.d32	func_B
	call __fexit__
	ret
is smaller and simpler; we can patch the jmp.d32 into a call when tracing.
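
For illustration, the two states of that patch site (sketch; leaving
aside the SLS note below):

	# not tracing: the tail-call is preserved
1:	jmp.d32	func_B
	call __fexit__		# never reached
	ret

	# tracing: the same 5 bytes rewritten into a call
1:	call func_B
	call __fexit__		# runs when func_B returns
	ret
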
The only problem is SLS (straight-line speculation); that might want an
int3 after the jmp too
(https://www.amd.com/en/corporate/product-security/bulletin/amd-sb-1026).
That does avoid the need for __ftail__ I suppose.