Message-ID: <20200309235210.GB20868@lenoir>
Date: Tue, 10 Mar 2020 00:52:11 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Joel Fernandes <joel@...lfernandes.org>
Subject: Re: Instrumentation and RCU

On Mon, Mar 09, 2020 at 01:47:10PM -0700, Paul E. McKenney wrote:
> On Mon, Mar 09, 2020 at 06:02:32PM +0100, Thomas Gleixner wrote:
> > #3) RCU idle
> >
> > Being able to trace code inside RCU idle sections is very similar to
> > the question raised in #1.
> >
> > Assume all of the instrumentation would be doing conditional RCU
> > schemes, i.e.:
> >
> >     if (rcuidle)
> >             ....
> >     else
> >             rcu_read_lock_sched()
> >
> > before invoking the actual instrumentation functions and of course
> > undoing that right after it, that really raises the question whether
> > it's worth it.
> >
> > Especially constructs like:
> >
> >     trace_hardirqs_off()
> >       idx = srcu_read_lock()
> >       rcu_irq_enter_irqson();
> >       ...
> >       rcu_irq_exit_irqson();
> >       srcu_read_unlock(idx);
> >
> >     if (user_mode)
> >        user_exit_irqoff();
> >     else
> >        rcu_irq_enter();
> >
> > are really more than questionable. For 99.9999% of instrumentation
> > users it's absolutely irrelevant whether this traces the interrupt
> > disabled time of user_exit_irqoff() or rcu_irq_enter() or not.
> >
> > But what's relevant is the tracer overhead which is, e.g., inflicted
> > with today's trace_hardirqs_off/on() implementation because that
> > unconditionally uses the rcuidle variant with the srcu/rcu_irq dance
> > around every tracepoint.
> >
> > Even if the tracepoint sits in the ASM code it just covers about ~20
> > more low-level ASM instructions. The tracer invocation, which is
> > even done twice when coming from user space on x86 (the second call
> > is optimized in the tracer C-code), costs definitely way more
> > cycles. When you take the srcu/rcu_irq dance into account it's a
> > complete disaster performance-wise.
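
[ For reference, the srcu/rcu_irq dance in question is the rcuidle path
  of __DO_TRACE() in include/linux/tracepoint.h; as of about v5.5 it
  looks roughly like the simplified sketch below (probe iteration and
  some details elided): ]

	#define __DO_TRACE(tp, proto, args, cond, rcuidle)		\
		do {							\
			int __maybe_unused __idx = 0;			\
									\
			if (!(cond))					\
				return;					\
									\
			/* srcu can't be used from NMI */		\
			WARN_ON_ONCE(rcuidle && in_nmi());		\
									\
			/* keep srcu and sched-rcu usage consistent */	\
			preempt_disable_notrace();			\
									\
			/*						\
			 * The dance: SRCU read side plus temporarily	\
			 * telling RCU the CPU is not idle, on every	\
			 * single tracepoint hit.			\
			 */						\
			if (rcuidle) {					\
				__idx = srcu_read_lock_notrace(&tracepoint_srcu); \
				rcu_irq_enter_irqson();			\
			}						\
									\
			/* ... call the attached probe functions ... */	\
									\
			if (rcuidle) {					\
				rcu_irq_exit_irqson();			\
				srcu_read_unlock_notrace(&tracepoint_srcu, __idx); \
			}						\
									\
			preempt_enable_notrace();			\
		} while (0)
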
>
> Suppose that we had a variant of RCU that had about the same read-side
> overhead as Preempt-RCU, but which could be used from idle as well as
> from CPUs in the process of coming online or going offline? I have not
> thought through the irq/NMI/exception entry/exit cases, but I don't see
> why that would be a problem.
>
> This would have explicit critical-section entry/exit code, so it would
> not be any help for trampolines.
>
> Would such a variant of RCU help?
>
> Yeah, I know. Just what the kernel doesn't need, yet another variant
> of RCU...
>
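
For concreteness, I guess the reader side of what you describe would
look something like the sketch below. The names and the per-task field
are made up for illustration; this is not an existing API:

	#include <linux/sched.h>

	/*
	 * Hypothetical explicit-entry/exit reader that does not care
	 * whether RCU is watching the CPU: it marks the task itself,
	 * and the grace-period machinery would scan these counters.
	 */
	static inline void rcu_tracing_read_lock(void)
	{
		current->rcu_tracing_nesting++;	/* hypothetical field */
		barrier();	/* order the count against the reader */
	}

	static inline void rcu_tracing_read_unlock(void)
	{
		barrier();	/* order the reader against the count */
		current->rcu_tracing_nesting--;
	}

Since the unlock is explicit, this indeed can't cover trampolines, as
you say.
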
I was thinking about having a tracing-specific implementation of RCU.
Last week Steve told me that the tracing ring buffer has its own ad-hoc
RCU implementation which schedules a thread on each CPU to complete a grace
period (did I understand that right?). Of course such a flavour of RCU wouldn't
be nice to nohz_full, but surely we can arrange some tweaks for those who
require strong isolation. I'm sure you have a much better idea, though.
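
For reference, a grace period of that style can be closed with the
workqueue API; a minimal sketch of my understanding of it, not the
actual ring buffer code:

	#include <linux/workqueue.h>

	static void noop_work(struct work_struct *work)
	{
		/* empty: having run at all is the point */
	}

	/*
	 * Once the no-op work has run on every CPU, each CPU has gone
	 * through the scheduler, so all readers that were running with
	 * preemption disabled when we started must have finished.
	 */
	static int adhoc_synchronize(void)
	{
		/* runs noop_work on every online CPU and waits for each */
		return schedule_on_each_cpu(noop_work);
	}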