linux-kernel - Re: [RFC PATCH -next v2 0/4] arm64/ftrace: support dynamic trampoline

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Yo4k2Y8oNcKG5ca0@FVFF77S0Q05N>
Date:   Wed, 25 May 2022 13:45:13 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>,
        cj.chengjian@...wei.com, huawei.libin@...wei.com,
        xiexiuqi@...wei.com, liwei391@...wei.com,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        catalin.marinas@....com, will@...nel.org, zengshun.wu@...look.com
Subject: Re: [RFC PATCH -next v2 0/4] arm64/ftrace: support dynamic trampoline

On Thu, Apr 21, 2022 at 08:37:58AM -0400, Steven Rostedt wrote:
> On Thu, 21 Apr 2022 09:13:01 +0800
> "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com> wrote:
> 
> > Not yet, Steve, ftrace_location() looks has no help to find a right 
> > rec->ip in our case,
> > 
> > ftrace_location() can find a right rec->ip when input ip is in the range 
> > between
> > 
> > sym+0 and sym+$end, but our question is how to  identify rec->ip from 
> > __mcount_loc,
> 
> Are you saying that the "ftrace location" is not between sym+0 and sym+$end?

IIUC yes -- this series as-is moves the call to the trampoline *before* sym+0.

Among other things that completely wrecks backtracing, so I'd *really* like to
avoid that (hance my suggested alternative).

> > this changed the patchable entry before bti to after in gcc:
> > 
> >     [1] https://reviews.llvm.org/D73680
> > 
> > gcc tells the place of first nop of the 5 NOPs when using 
> > -fpatchable-function-entry=5,3,
> > 
> > but not tells the first nop after bti, so we don't know how to adjust 
> > our rec->ip for ftrace.
> 
> OK, so I do not understand how the compiler is injecting bti with mcount
> calls, so I'll just walk away for now ;-)

When using BTI, the compiler has to drop a BTI *at* the function entry point
(i.e. sym+0) for any function that can be called indirectly, but can omit this
when the function is only directly called (which is the case for most functions
created via insterprocedural specialization, or for a number of static
functions).

Today, when we pass:

	-fpatchable-function-entry=2

... the compiler places 2 NOPs *after* any BTI, and records the location of the
first NOP. So the two cases we get are:

	__func_without_bti:
		NOP		<--- recorded location
		NOP

	__func_with_bti:
		BTI
		NOP		<--- recorded location
		NOP

... which works just fine, since either sym+0 or sym+4 are reasonable
locations for the patch-site to live.

However, if we were to pass:

	-fpatchable-function-entry=5,3

... the compiler places 3 NOPs *before* any BTI, and 2 NOPs *after* any BTI,
still recording the location of the first NOP. So in the two cases we get:

		NOP		<--- recorded location
		NOP
		NOP
	__func_without_bti:
		NOP
		NOP

		NOP		<--- recorded location
		NOP
		NOP
	__func_with_bti:
		BTI
		NOP
		NOP

... so where we want to patch one of the later nops to banch to a pre-function
NOP, we need to know whether or not the compiler generated a BTI. We can
discover discover that either by:

* Checking whether the recorded location is at sym+0 (no BTI) or sym+4 (BTI).

* Reading the instruction before the recorded location, and seeing if this is a
  BTI.

... and depending on how we handle thigns the two cases *might* need different
trampolines.

Thanks,
Mark.