Date:   Thu, 14 Oct 2021 21:57:22 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Valentin Schneider <Valentin.Schneider@....com>
Cc:     Frederic Weisbecker <frederic@...nel.org>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Uladzislau Rezki <urezki@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Neeraj Upadhyay <neeraju@...eaurora.org>,
        Josh Triplett <josh@...htriplett.org>,
        Joel Fernandes <joel@...lfernandes.org>, rcu@...r.kernel.org
Subject: Re: [PATCH 03/11] rcu/nocb: Invoke rcu_core() at the start of
 deoffloading

On Thu, Oct 14, 2021 at 12:42:40PM +0100, Valentin Schneider wrote:
> On 14/10/21 00:07, Boqun Feng wrote:
> >> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >> index e38028d48648..b236271b9022 100644
> >> --- a/kernel/rcu/tree.c
> >> +++ b/kernel/rcu/tree.c
> >> @@ -2717,6 +2717,23 @@ static __latent_entropy void rcu_core(void)
> >>      unsigned long flags;
> >>      struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
> >>      struct rcu_node *rnp = rdp->mynode;
> >> +	/*
> >> +	 * On RT rcu_core() can be preempted when IRQs aren't disabled.
> >> +	 * Therefore this function can race with concurrent NOCB (de-)offloading
> >> +	 * on this CPU and the below condition must be considered volatile.
> >> +	 * However if we race with:
> >> +	 *
> >> +	 * _ Offloading:   In the worst case we accelerate or process callbacks
> >> +	 *                 concurrently with NOCB kthreads. We are guaranteed to
> >> +	 *                 call rcu_nocb_lock() if that happens.
> >
> > If offloading races with rcu_core(), can the following happen?
> >
> >       <offload work>
> >       rcu_nocb_rdp_offload():
> >                                               rcu_core():
> >                                                 ...
> >                                                 rcu_nocb_lock_irqsave(); // not a real lock
> >         raw_spin_lock_irqsave(->nocb_lock);
> >           rdp_offload_toggle():
> >             <LOCKING | OFFLOADED set>
> >                                                 if (!rcu_segcblist_restempty(...))
> >                                                   rcu_accelerate_cbs_unlocked(...);
> >                                                 rcu_nocb_unlock_irqrestore();
> >                                                 // ^ a real unlock,
> >                                                 // and will preempt_enable()
> >           // offload continue with ->nocb_lock not held
> >
> > If this can happen, it means an unpaired preempt_enable() and an
> > incorrect unlock. Thoughts? Maybe I'm missing something here?
> >
> 
> As Frederic pointed out in an earlier thread [1], this can't happen because
> we still have IRQs disabled, and the offloading process has to be processed
> on the CPU being offloaded. IOW, in the above scenario, rcu_core() can't be
> preempted by the rcu_nocb_rdp_offload() work until it re-enables IRQs at
> rcu_nocb_unlock_irqrestore().
> 
> (hopefully Paul or Frederic will correct me if I've just spewed garbage)
> 
> [1]: https://lore.kernel.org/lkml/20210930105340.GA232241@lothringen/
> 

Thanks! I think you and Frederic are right: this cannot happen. Thank
you both for looking into this ;-)

Regards,
Boqun

> > Regards,
> > Boqun
> >
