Message-ID: <20200903153555.GN29330@paulmck-ThinkPad-P72>
Date: Thu, 3 Sep 2020 08:35:55 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: peterz@...radead.org
Cc: Ulf Hansson <ulf.hansson@...aro.org>,
Lina Iyer <ilina@...eaurora.org>,
Naresh Kamboju <naresh.kamboju@...aro.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Saravana Kannan <saravanak@...gle.com>,
open list <linux-kernel@...r.kernel.org>,
linux-mmc <linux-mmc@...r.kernel.org>,
lkft-triage@...ts.linaro.org, rcu@...r.kernel.org,
Linux PM <linux-pm@...r.kernel.org>,
Anders Roxell <anders.roxell@...aro.org>,
Arnd Bergmann <arnd@...db.de>,
Rajendra Nayak <rnayak@...eaurora.org>,
John Stultz <john.stultz@...aro.org>,
Stephen Boyd <sboyd@...nel.org>,
Lars Povlsen <lars.povlsen@...rochip.com>,
madhuparnabhowmik10@...il.com,
Viresh Kumar <viresh.kumar@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC][PATCH] cpu_pm: Remove RCU abuse

On Thu, Sep 03, 2020 at 05:08:19PM +0200, peterz@...radead.org wrote:
> On Thu, Sep 03, 2020 at 04:36:35PM +0200, Ulf Hansson wrote:
> > On Thu, 3 Sep 2020 at 15:53, <peterz@...radead.org> wrote:
> > > static int cpu_pm_notify(enum cpu_pm_event event)
> > > {
> > > int ret;
> > >
> > > + lockdep_assert_irqs_disabled();
> >
> > Nitpick, maybe the lockdep should be moved to a separate patch.
>
> Well, the unregister relies on IRQs being disabled here, so I figured
> asserting this was a good thing ;-)
>
> Starting the audit below, this might not in fact be true, which then
> invalidates the unregister implementation. In particular the notifier in
> arch/arm/kernel/hw_breakpoint.c seems to unconditionally enable IRQs.
>
> > > + ret = raw_notifier_call_chain(&cpu_pm_notifier_chain, event, NULL);
> >
> > Converting to raw_notifiers seems reasonable - if we need to avoid the
> > RCU usage.
> >
> > My point is, I wonder whether the notifier callbacks themselves are
> > safe from RCU usage. For example, I would not be surprised if tracing
> > is happening behind them.
>
> A bunch of them seem to call into the clk domain stuff, and I think
> there's tracepoints in that.
>
> > Moreover, I am not sure that we really need to prevent and limit
> > tracing from happening. Instead we could push rcu_idle_enter|exit()
> > further down to the arch specific code in the cpuidle drivers, as you
> > kind of all proposed earlier.
>
> Well, at some point the CPU is in a really dodgy state, ISTR there being
> ARM platforms where you have to manually leave the cache coherency
> fabric and all sorts of insanity. There should be a definite cut-off on
> tracing before that.
>
> Also, what is the point of all these clock and power domain callbacks, if
> not to put the CPU into an extremely low power state? Surely you want to
> limit the amount of code that's run when the CPU is in such a state.
>
> > In this way, we can step by step, move to a new "version" of
> > cpu_pm_enter() that doesn't have to deal with rcu_irq_enter_irqson(),
> > because RCU hasn't been pushed to idle yet.
>
> That should be easy enough to audit. The thing is that mainline is now
> generating (debug) splats, and some people are upset with this.
>
> If you're ok with ARM not being lockdep clean while this is being
> reworked I'm perfectly fine with that.
>
> (There used to be a separate CONFIG for RCU-lockdep, but that seems to
> have been removed)

CONFIG_PROVE_RCU still gates RCU_LOCKDEP_WARN(), but it is now a
def_bool that follows CONFIG_PROVE_LOCKING.

It would not be hard to make CONFIG_PROVE_RCU separately settable only
for arm, if that would help.
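
For illustration only, here is a minimal and untested sketch of what that
could look like. It assumes the current definition is the bare
"def_bool PROVE_LOCKING" entry (which I believe lives in
kernel/rcu/Kconfig.debug), and the prompt text is made up:

```kconfig
# Hypothetical sketch, not a tested patch.
# The prompt is visible only when ARM is set, so only arm builds can
# turn PROVE_RCU off independently. Everywhere else the option has no
# prompt and simply follows PROVE_LOCKING, as it does today.
config PROVE_RCU
	bool "RCU debugging: prove RCU correctness" if ARM
	depends on PROVE_LOCKING
	default y
```

With something along these lines, an arm config could keep
CONFIG_PROVE_LOCKING=y while leaving CONFIG_PROVE_RCU unset, which would
silence the RCU_LOCKDEP_WARN() splats during the rework.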

							Thanx, Paul