Message-ID: <Yo4k2Y8oNcKG5ca0@FVFF77S0Q05N>
Date: Wed, 25 May 2022 13:45:13 +0100
From: Mark Rutland <mark.rutland@....com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com>,
cj.chengjian@...wei.com, huawei.libin@...wei.com,
xiexiuqi@...wei.com, liwei391@...wei.com,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
catalin.marinas@....com, will@...nel.org, zengshun.wu@...look.com
Subject: Re: [RFC PATCH -next v2 0/4] arm64/ftrace: support dynamic trampoline
On Thu, Apr 21, 2022 at 08:37:58AM -0400, Steven Rostedt wrote:
> On Thu, 21 Apr 2022 09:13:01 +0800
> "Wangshaobo (bobo)" <bobo.shaobowang@...wei.com> wrote:
>
> > Not yet, Steve, ftrace_location() looks has no help to find a right
> > rec->ip in our case,
> >
> > ftrace_location() can find a right rec->ip when input ip is in the range
> > between
> >
> > sym+0 and sym+$end, but our question is how to identify rec->ip from
> > __mcount_loc,
>
> Are you saying that the "ftrace location" is not between sym+0 and sym+$end?
IIUC yes -- this series as-is moves the call to the trampoline *before* sym+0.
Among other things that completely wrecks backtracing, so I'd *really* like to
avoid that (hence my suggested alternative).
> > this changed the patchable entry before bti to after in gcc:
> >
> > [1] https://reviews.llvm.org/D73680
> >
> > gcc tells the place of first nop of the 5 NOPs when using
> > -fpatchable-function-entry=5,3,
> >
> > but not tells the first nop after bti, so we don't know how to adjust
> > our rec->ip for ftrace.
>
> OK, so I do not understand how the compiler is injecting bti with mcount
> calls, so I'll just walk away for now ;-)
When using BTI, the compiler has to drop a BTI *at* the function entry point
(i.e. sym+0) for any function that can be called indirectly, but can omit this
when the function is only directly called (which is the case for most functions
created via interprocedural specialization, or for a number of static
functions).
Today, when we pass:
-fpatchable-function-entry=2
... the compiler places 2 NOPs *after* any BTI, and records the location of the
first NOP. So the two cases we get are:
__func_without_bti:
	NOP		<--- recorded location
	NOP

__func_with_bti:
	BTI
	NOP		<--- recorded location
	NOP
... which works just fine, since either sym+0 or sym+4 are reasonable
locations for the patch-site to live.
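As a rough illustration of the -fpatchable-function-entry=2 case, the sketch below models the start of a function as an array of 32-bit instruction words and works out where the first NOP lands relative to sym. The helper names (insn_is_bti, patch_site_offset_entry2) are made up for this example and are not kernel APIs; the constants are the A64 HINT-space encodings of NOP and BTI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define A64_INSN_SIZE 4u
#define A64_NOP       0xd503201fu  /* HINT #0 */
#define A64_BTI       0xd503241fu  /* HINT #32; BTI c/j/jc differ only in bits [7:6] */
#define A64_BTI_MASK  0xffffff3fu  /* ignore bits [7:6] so all four BTI forms match */

/* Hypothetical helper: true for BTI, BTI c, BTI j and BTI jc. */
static bool insn_is_bti(uint32_t insn)
{
	return (insn & A64_BTI_MASK) == A64_BTI;
}

/*
 * Hypothetical helper for the "=2" layout. entry[] holds the instructions
 * starting at sym+0. Returns the byte offset from sym of the first NOP
 * (0 without a BTI, 4 with one), or -1 if the two NOPs are not found.
 */
static int patch_site_offset_entry2(const uint32_t *entry, size_t count)
{
	size_t idx = 0;

	/* An optional BTI sits at sym+0; the NOPs follow it. */
	if (count > 0 && insn_is_bti(entry[0]))
		idx = 1;

	if (idx + 2 > count)
		return -1;
	if (entry[idx] != A64_NOP || entry[idx + 1] != A64_NOP)
		return -1;

	return (int)(idx * A64_INSN_SIZE);
}
```

Fed a function that starts NOP, NOP this yields sym+0, and fed one that starts BTI c, NOP, NOP it yields sym+4, matching the two recorded locations drawn above.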
However, if we were to pass:
-fpatchable-function-entry=5,3
... the compiler places 3 NOPs *before* any BTI, and 2 NOPs *after* any BTI,
still recording the location of the first NOP. So in the two cases we get:
	NOP		<--- recorded location
	NOP
	NOP
__func_without_bti:
	NOP
	NOP

	NOP		<--- recorded location
	NOP
	NOP
__func_with_bti:
	BTI
	NOP
	NOP
... so where we want to patch one of the later NOPs to branch to a pre-function
NOP, we need to know whether or not the compiler generated a BTI. We can
discover that either by:
* Checking whether the recorded location is at sym+0 (no BTI) or sym+4 (BTI).
* Reading the instruction before the recorded location, and seeing if this is a
BTI.
... and depending on how we handle things the two cases *might* need different
trampolines.
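To make the instruction-reading option concrete against the 5,3 layout drawn above, here is a minimal sketch. It starts from the recorded location (the first pre-function NOP), steps over the three pre-function NOPs to reach sym+0, and reads that instruction to decide whether a BTI is present. This is only one reading of the check, not what the series implements; struct patch_site and classify_entry53 are hypothetical names, and the 3-before/2-after split is hard-coded as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define A64_INSN_SIZE 4
#define A64_NOP       0xd503201fu  /* HINT #0 */
#define A64_BTI       0xd503241fu  /* HINT #32; BTI c/j/jc differ only in bits [7:6] */
#define A64_BTI_MASK  0xffffff3fu
#define PRE_NOPS      3            /* assumed: -fpatchable-function-entry=5,3 */

/* Hypothetical description of one patch-site, as byte offsets from sym. */
struct patch_site {
	bool has_bti;
	long pre_nop_off;   /* first pre-function NOP (negative) */
	long post_nop_off;  /* first NOP at or after the entry point */
};

/* 'recorded' points at the first pre-function NOP, as listed in __mcount_loc. */
static struct patch_site classify_entry53(const uint32_t *recorded)
{
	const uint32_t *entry = recorded + PRE_NOPS;  /* sym+0 */
	struct patch_site ps;

	ps.has_bti = (*entry & A64_BTI_MASK) == A64_BTI;
	ps.pre_nop_off = -(long)(PRE_NOPS * A64_INSN_SIZE);
	/* With a BTI at sym+0 the patchable NOPs start one instruction later. */
	ps.post_nop_off = ps.has_bti ? A64_INSN_SIZE : 0;
	return ps;
}
```

A caller could then patch the NOP at post_nop_off to branch to the pre-function NOPs, and pick a trampoline based on has_bti if the two cases do end up needing different ones.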
Thanks,
Mark.