Message-ID: <20210728220137.GD293265@lothringen>
Date: Thu, 29 Jul 2021 00:01:37 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Valentin Schneider <valentin.schneider@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-rt-users@...r.kernel.org,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Steven Rostedt <rostedt@...dmis.org>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Josh Triplett <josh@...htriplett.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Joel Fernandes <joel@...lfernandes.org>,
Anshuman Khandual <anshuman.khandual@....com>,
Vincenzo Frascino <vincenzo.frascino@....com>,
Steven Price <steven.price@....com>,
Ard Biesheuvel <ardb@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH 2/3] rcu/nocb: Check for migratability rather than pure
preemptability

On Wed, Jul 28, 2021 at 08:34:14PM +0100, Valentin Schneider wrote:
> On 28/07/21 01:08, Frederic Weisbecker wrote:
> > On Wed, Jul 21, 2021 at 12:51:17PM +0100, Valentin Schneider wrote:
> >> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
> >> ---
> >> kernel/rcu/tree_plugin.h | 3 +--
> >> 1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> >> index ad0156b86937..6c3c4100da83 100644
> >> --- a/kernel/rcu/tree_plugin.h
> >> +++ b/kernel/rcu/tree_plugin.h
> >> @@ -70,8 +70,7 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> >> !(lockdep_is_held(&rcu_state.barrier_mutex) ||
> >> (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) ||
> >> rcu_lockdep_is_held_nocb(rdp) ||
> >> - (rdp == this_cpu_ptr(&rcu_data) &&
> >> - !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) ||
> >> + (rdp == this_cpu_ptr(&rcu_data) && is_pcpu_safe()) ||
> >
> > I fear that won't work. We really need any caller of rcu_rdp_is_offloaded()
> > on the local rdp to have preemption disabled and not just migration disabled,
> > because we must protect against concurrent offloaded state changes.
> >
> > The offloaded state is changed by a workqueue that executes on the target rdp.
> >
> > Here is a practical example where it matters:
> >
> > CPU 0
> > -----
> > // =======> task rcuc running
> > rcu_core {
> >     rcu_nocb_lock_irqsave(rdp, flags) {
> >         if (!rcu_segcblist_is_offloaded(rdp->cblist)) {
> >             // is not offloaded right now, so it's going
> >             // to just disable IRQs. Oh no wait:
> >             // preemption
> >             // ========> workqueue running
> >             rcu_nocb_rdp_offload();
> >             // ========> task rcuc resume
> >             local_irq_disable();
> >         }
> >     }
> >     ....
> >     rcu_nocb_unlock_irqrestore(rdp, flags) {
> >         if (rcu_segcblist_is_offloaded(rdp->cblist)) {
> >             // is offloaded right now so:
> >             raw_spin_unlock_irqrestore(rdp, flags);
> >
> > And that will explode because that's an unpaired unlock on nocb_lock.
>
> Harumph, that doesn't look good, thanks for pointing this out.
>
> AFAICT PREEMPT_RT doesn't actually require to disable softirqs here (since
> it forces RCU callbacks on the RCU kthreads), but disabled softirqs seem to
> be a requirement for much of the underlying functions and even some of the
> callbacks (delayed_put_task_struct() ~> vfree() pays close attention to
> in_interrupt() for instance).
>
> Now, if the offloaded state was (properly) protected by a local_lock, do
> you reckon we could then keep preemption enabled?

I guess we could take such a local lock on the update side
(rcu_nocb_rdp_offload()) and then also take it in the rcuc kthread/softirq
and maybe in other places.

But we must first make sure that rcu_core() is preempt-safe from a general
perspective. From a quick glance I can't find obvious issues... yet.

Paul, maybe you can see something?

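
To make that concrete, here is a rough userspace model in plain C. It is not
kernel code and not a proposed patch: struct model_rdp, its fields and all the
model_*() helpers are made up for illustration. It only shows why the update
side and the rcu_core() side would both have to take the same lock, so that
the offloaded state cannot change between the lock and the unlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct rcu_data; none of this is kernel code. */
struct model_rdp {
	bool offloaded;       /* models rcu_segcblist_is_offloaded() */
	bool state_locked;    /* models the new local lock (assumed, does not exist yet) */
	bool update_pending;  /* an offload request that had to wait for the lock */
	int nocb_lock_depth;  /* models rdp->nocb_lock; negative means unpaired unlock */
	int irq_off_depth;    /* models IRQ disable/restore nesting */
};

/* Update side: models the workqueue running rcu_nocb_rdp_offload(). */
static void model_offload(struct model_rdp *rdp)
{
	if (rdp->state_locked) {
		/* A real lock would block here; the model just records the request. */
		rdp->update_pending = true;
		return;
	}
	rdp->update_pending = false;
	rdp->offloaded = true;
}

/* Models rcu_nocb_lock_irqsave(), with the workqueue preempting in the window. */
static void model_lock(struct model_rdp *rdp)
{
	bool offloaded = rdp->offloaded;  /* check the state ... */

	model_offload(rdp);               /* ... preemption: workqueue runs here ... */
	if (offloaded)
		rdp->nocb_lock_depth++;   /* offloaded path also takes nocb_lock */
	rdp->irq_off_depth++;             /* ... then act: both paths disable IRQs */
}

/* Models rcu_nocb_unlock_irqrestore(), which re-reads the offloaded state. */
static void model_unlock(struct model_rdp *rdp)
{
	if (rdp->offloaded)
		rdp->nocb_lock_depth--;   /* unpaired if model_lock() skipped the lock */
	rdp->irq_off_depth--;
	assert(rdp->irq_off_depth >= 0);
}

/* Runs one modelled rcu_core() pass and returns the final nocb_lock depth. */
static int model_rcu_core(bool use_state_lock)
{
	struct model_rdp rdp = { 0 };

	rdp.state_locked = use_state_lock;
	model_lock(&rdp);
	model_unlock(&rdp);
	if (use_state_lock) {
		rdp.state_locked = false;
		if (rdp.update_pending)
			model_offload(&rdp);  /* deferred update runs after release */
	}
	return rdp.nocb_lock_depth;
}
```

In this model, model_rcu_core(false) returns -1 (the unlock found the state
offloaded although the lock side never took nocb_lock), whereas
model_rcu_core(true) returns 0 because the update is deferred until the
reader side has released the lock.
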
>
> From a naive outsider PoV, rdp->nocb_lock looks like a decent candidate,
> but it's a *raw* spinlock (I can't tell right now whether changing this is
> a horrible idea or not), and then there's

Yeah, that's not possible: nocb_lock is too low level and has to be taken with
IRQs disabled. So if we go with that local_lock solution, we need a new lock.

Thanks.