Message-Id: <1560935530.70niyxru6o.naveen@linux.ibm.com>
Date: Wed, 19 Jun 2019 15:23:26 +0530
From: "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>
To: Masami Hiramatsu <mhiramat@...nel.org>,
Ingo Molnar <mingo@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Steven Rostedt <rostedt@...dmis.org>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 4/7] powerpc/ftrace: Additionally nop out the preceding
mflr with -mprofile-kernel
Nicholas Piggin wrote:
> Michael Ellerman wrote on June 19, 2019 3:14 pm:
>> Hi Naveen,
>>
>> Sorry I meant to reply to this earlier .. :/
No problem. Thanks for the questions.
>>
>> "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com> writes:
>>> With -mprofile-kernel, gcc emits 'mflr r0' followed by 'bl _mcount' to
>>> enable function tracing and profiling. So far, with dynamic ftrace, we
>>> used to only patch out the branch to _mcount(). However, the mflr is
>>> executed by the branch unit, which on POWER9 can only execute one such
>>> instruction per cycle and is shared with branches, so it would be nice
>>> to avoid it where possible.
>>>
>>> We cannot simply nop out the mflr either. When enabling function
>>> tracing, there is a race: a thread may be interrupted right after
>>> executing the nop'ed-out mflr, and if tracing is enabled before it
>>> resumes, it would execute the now-patched-in branch to _mcount()
>>> without having executed the preceding mflr.
>>>
>>> To solve this, we now enable function tracing in 2 steps: patch in the
>>> mflr instruction, use synchronize_rcu_tasks() to ensure all existing
>>> threads make progress, and then patch in the branch to _mcount(). We
>>> override ftrace_replace_code() with a powerpc64 variant for this
>>> purpose.
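To make that two-step sequence concrete, it boils down to something like
the sketch below. The signature and record filtering are simplified, the
disable path is elided, and ftrace_make_call()/FTRACE_ADDR stand in for
the exact call sites in the patch:

#include <linux/ftrace.h>
#include <asm/code-patching.h>
#include <asm/ppc-opcode.h>

void ftrace_replace_code(int enable)
{
	struct ftrace_rec_iter *iter;
	struct dyn_ftrace *rec;

	if (!enable)
		return;		/* the disable path is elided here */

	/* Step 1: patch the preceding 'mflr r0' back in at each site */
	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);
		patch_instruction((unsigned int *)(rec->ip - MCOUNT_INSN_SIZE),
				  PPC_INST_MFLR);
	}

	/*
	 * Ensure no task is still executing between the nop that
	 * replaced the mflr and the call site.
	 */
	synchronize_rcu_tasks();

	/* Step 2: patch the nop at the call site into 'bl _mcount' */
	for_ftrace_rec_iter(iter) {
		rec = ftrace_rec_iter_record(iter);
		ftrace_make_call(rec, FTRACE_ADDR);
	}
}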
>>
>> According to the ISA we're not allowed to patch mflr at runtime. See the
>> section on "CMODX".
>
> According to the "quasi patch class" engineering note, we can patch
> anything with a preferred nop. But that's written as an optional
> facility, which we don't have a feature to test for.
>
Hmm... I wonder what the implications are. We have been patching in a
'trap' instruction for kprobes for a long time now, and we also patch
back the original instruction (which can be anything) when the probe is
removed.
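For reference, that kprobes path is essentially the following
(simplified from arch/powerpc/kernel/kprobes.c):

#include <linux/kprobes.h>
#include <asm/code-patching.h>

void arch_arm_kprobe(struct kprobe *p)
{
	/* replace whatever instruction is at p->addr with a trap */
	patch_instruction(p->addr, BREAKPOINT_INSTRUCTION);
}

void arch_disarm_kprobe(struct kprobe *p)
{
	/* patch the saved original instruction back in */
	patch_instruction(p->addr, p->opcode);
}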
>>
>> I'm also not convinced the ordering between the two patches is
>> guaranteed by the ISA, given that there's possibly no isync on the other
>> CPU.
>
> Will they go through a context synchronizing event?
>
> synchronize_rcu_tasks() should ensure a thread is scheduled away, but
> I'm not actually sure it guarantees CSI if it's kernel->kernel. Could
> do a smp_call_function to do the isync on each CPU to be sure.
Good point. Per
Documentation/RCU/Design/Requirements/Requirements.html#Tasks RCU:
"The solution, in the form of Tasks RCU, is to have implicit read-side
critical sections that are delimited by voluntary context switches, that
is, calls to schedule(), cond_resched(), and synchronize_rcu_tasks(). In
addition, transitions to and from userspace execution also delimit
tasks-RCU read-side critical sections."
I suppose transitions to/from userspace, as well as calls to schedule(),
result in a context synchronizing instruction being executed. But if
some tasks only go through cond_resched() or synchronize_rcu_tasks(), we
probably won't have a CSI executed.
Also:
"In CONFIG_PREEMPT=n kernels, trampolines cannot be preempted, so these
APIs map to call_rcu(), synchronize_rcu(), and rcu_barrier(),
respectively."
In this scenario as well, I think we won't have a CSI executed in the
case of cond_resched().
Should we enhance patch_instruction() to handle that?
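Something like the below, perhaps -- just a sketch, and
patch_instruction_sync() is a made-up name; whether this belongs in
patch_instruction() itself or in a separate helper invoked after a batch
of patches is an open question:

#include <linux/smp.h>

static void do_isync(void *info)
{
	asm volatile("isync" : : : "memory");
}

/*
 * Execute a context synchronizing instruction on every CPU so that
 * no CPU continues executing a stale prefetched copy of a patched
 * instruction. on_each_cpu() also covers the local CPU, which
 * smp_call_function() would not.
 */
void patch_instruction_sync(void)
{
	on_each_cpu(do_isync, NULL, 1);
}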
- Naveen