[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wi3o-wwFVbAXb7YZZViDBsZ_yMVqyOAEZsx5qcskLsOcg@mail.gmail.com>
Date: Sun, 29 Nov 2020 11:31:41 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...nel.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6
On Sun, Nov 29, 2020 at 5:38 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Yet two more places which invoke tracing from RCU disabled regions in the
> idle path. Similar to the entry path the low level idle functions have to
> be non-instrumentable.
This really seems less than optimal.
In particular, lookie here:
> @@ -94,9 +94,35 @@ void __cpuidle default_idle_call(void)
>
> trace_cpu_idle(1, smp_processor_id());
> stop_critical_timings();
> +
> + /*
> + * arch_cpu_idle() is supposed to enable IRQs, however
> + * we can't do that because of RCU and tracing.
> + *
> + * Trace IRQs enable here, then switch off RCU, and have
> + * arch_cpu_idle() use raw_local_irq_enable(). Note that
> + * rcu_idle_enter() relies on lockdep IRQ state, so switch that
> + * last -- this is very similar to the entry code.
> + */
> + trace_hardirqs_on_prepare();
> + lockdep_hardirqs_on_prepare(_THIS_IP_);
> rcu_idle_enter();
> + lockdep_hardirqs_on(_THIS_IP_);
> +
> arch_cpu_idle();
> +
> + /*
> + * OK, so IRQs are enabled here, but RCU needs them disabled to
> + * turn itself back on.. funny thing is that disabling IRQs
> + * will cause tracing, which needs RCU. Jump through hoops to
> + * make it 'work'.
> + */
> + raw_local_irq_disable();
> + lockdep_hardirqs_off(_THIS_IP_);
> rcu_idle_exit();
> + lockdep_hardirqs_on(_THIS_IP_);
> + raw_local_irq_enable();
> +
> start_critical_timings();
> trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
> }
And look at what the code generation for the idle exit path is when
lockdep isn't even on.
It's *literally*
cli
call rcu_idle_exit
sti
and guess what rcu_idle_exit does?
Yeah, that one does "pushf; cli; call rcu_eqs_exit; popf".
So here we are, in the somewhat critical "an interrupt woke us up"
section, and we're doing just ridiculously stupid things.
I've pulled this, because it solves a problem, but there's a deeper
problem here in how all this is done.
The idle path is actually quite important. I can point to real loads
where this is a big part of the CPU profile, because you end up having
lots of "go to sleep for very short times, because the thing we were
waiting for takes almost no time at all".
Linus
Powered by blists - more mailing lists