[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87imjc5f6a.fsf@nanos.tec.linutronix.de>
Date: Tue, 10 Mar 2020 17:48:45 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
paulmck <paulmck@...nel.org>,
"Joel Fernandes\, Google" <joel@...lfernandes.org>,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: Instrumentation and RCU
Mathieu Desnoyers <mathieu.desnoyers@...icios.com> writes:
> ----- On Mar 9, 2020, at 3:52 PM, Thomas Gleixner tglx@...utronix.de wrote:
>> In a quick test I did with a invalid syscall number with profiling the
>> trace_hardirqs_off() is pretty prominent and goes down by roughly a
>> factor of 2 when I move it past enter_from_user_mode() and use just the
>> non RCU idle variant.
>
> I think one issue here is that trace_hardirqs_off() is now shared between
> lockdep and tracing. For lockdep, we have the following comment:
>
> /*
> * IRQ from user mode.
> *
> * We need to tell lockdep that IRQs are off. We can't do this until
> * we fix gsbase, and we should do it before enter_from_user_mode
> * (which can take locks). Since TRACE_IRQS_OFF is idempotent,
> * the simplest way to handle it is to just call it twice if
> * we enter from user mode. There's no reason to optimize this since
> * TRACE_IRQS_OFF is a no-op if lockdep is off.
> */
> TRACE_IRQS_OFF
>
> CALL_enter_from_user_mode
>
> 1:
> ENTER_IRQ_STACK old_rsp=%rdi save_ret=1
> /* We entered an interrupt context - irqs are off: */
> TRACE_IRQS_OFF
>
> which seems to imply that lockdep requires TRACE_IRQS_OFF to be performed
> _before_ entering from usermode. I don't expect this to be useful at all for
> other tracers though. I think this should be replaced by a new e.g.
> LOCKDEP_ENTER_FROM_USER_MODE or such which would call into lockdep without
> calling other tracers.
See the entry series I'm working on. Aside of moving all this nonsense
into C-code it splits lockdep and tracing so it looks like this:
lockdep_hardirqs_off();
user_exit_irqsoff();
__trace_hardirqs_off();
The latter uses regular RCU and not the scru/rcu_irq dance.
>> Right, but that still does the whole rcu_irq dance especially in the
>> entry code just to trace 50 or 100 instructions which are turning on RCU
>> anyway.
>
> Agreed. Would changing this to a lockdep-specific call as I suggest above
> solve this ?
That split exist for a few weeks now at least in my patches :)
>>> If a tracer recurses, or if a tracer attempts to trace another tracer, the
>>> instrumentation would break the recursion chain by preventing instrumentation
>>> from firing. If we end up caring about tracers tracing other tracers, we could
>>> have one distinct flag per tracer and let each tracer break the recursion chain.
>>>
>>> Having this flag per kernel stack rather than per CPU or per thread would
>>> allow tracing of nested interrupt handlers (and NMIs), but would break
>>> call chains both within the same stack or going through a trap. I think
>>> it could be a nice complementary safety net to handle mishaps in a non-fatal
>>> way.
>>
>> That works as long as none of this uses breakpoint based patching to
>> dynamically disable/enable stuff.
>
> I'm clearly missing something here. I was expecting the "in_tracing" flag trick
> to be able to fix the breakpoint recursion issue. What is the problem I'm missing
> here ?
How do you "fix" that when you can't reach the tracepoint because you
trip over a breakpoint and then while trying to fixup that stuff you hit
another one?
Thanks,
tglx
Powered by blists - more mailing lists