Message-ID: <20210922113117.GB106513@lothringen>
Date:   Wed, 22 Sep 2021 13:31:17 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Valentin Schneider <valentin.schneider@....com>,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        rcu@...r.kernel.org, linux-rt-users@...r.kernel.org,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Josh Triplett <josh@...htriplett.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Steven Price <steven.price@....com>,
        Ard Biesheuvel <ardb@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Mike Galbraith <efault@....de>
Subject: Re: rcu/tree: Protect rcu_rdp_is_offloaded() invocations on RT

On Tue, Sep 21, 2021 at 07:18:37PM -0700, Paul E. McKenney wrote:
> On Wed, Sep 22, 2021 at 01:36:27AM +0200, Frederic Weisbecker wrote:
> > Doing the local_irq_save() before checking that the segcblist is offloaded
> > protects that state from being changed (provided we lock the local rdp). Then we
> > can safely manipulate cblist, whether locked or unlocked.
> > 
> > 2) The actual call to rcu_do_batch(). If we are preempted between
> > rcu_segcblist_completely_offloaded() and rcu_do_batch() with a deoffload in
> > the middle, we miss the callback invocation. Invoking rcu_core() at the end of
> > the deoffloading process should solve that.
> 
> Maybe invoke rcu_core() at that point?  My concern is that there might
> be an extended time between the missed rcu_do_batch() and the end of
> the deoffloading process.

Agreed!

> 
> > > Reported-by: Valentin Schneider <valentin.schneider@....com>
> > > Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> > > ---
> > >  kernel/rcu/tree.c |    7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2278,13 +2278,13 @@ rcu_report_qs_rdp(struct rcu_data *rdp)
> > >  {
> > >  	unsigned long flags;
> > >  	unsigned long mask;
> > > -	bool needwake = false;
> > > -	const bool offloaded = rcu_rdp_is_offloaded(rdp);
> > > +	bool offloaded, needwake = false;
> > >  	struct rcu_node *rnp;
> > >  
> > >  	WARN_ON_ONCE(rdp->cpu != smp_processor_id());
> > >  	rnp = rdp->mynode;
> > >  	raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > +	offloaded = rcu_rdp_is_offloaded(rdp);
> > >  	if (rdp->cpu_no_qs.b.norm || rdp->gp_seq != rnp->gp_seq ||
> > >  	    rdp->gpwrap) {
> > 
> > BTW Paul, if we happen to switch to non-NOCB (deoffload) some time after
> > rcu_report_qs_rdp(), it's possible that the rcu_accelerate_cbs()
> > that was supposed to be handled by nocb kthreads on behalf of
> > rcu_core() -> rcu_report_qs_rdp() would not happen. At least not until
> > we invoke rcu_core() again. Not sure how much harm that could cause.
> 
> Again, can we just invoke rcu_core() as soon as this is noticed?

Right. So I'm going to do things a bit differently. I'm going to add
a new segcblist state flag so that during the deoffloading process,
the very first step is an invoke_rcu_core() on the target after setting a
flag that requires handling all these things: accelerate/do_batch, etc...

That leaves the "do we still have pending callbacks after do_batch?"
case, in which we'll need to invoke rcu_core() again for as long as we are
in the middle of deoffloading.
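
For illustration only, the plan above can be modeled in plain userspace C.
This is a rough sketch, not kernel code: the MODEL_* flags, struct model_rdp
and the model_*() helpers are all hypothetical names made up for this
example, and the real segcblist flags, locking and kthread handshakes are
left out entirely. It only shows the ordering: set the flag first, kick the
core, and keep re-invoking it while callbacks remain and deoffloading is
still in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags; NOT the kernel's actual segcblist state flags. */
#define MODEL_OFFLOADED 0x1 /* callbacks are handled by the nocb kthreads */
#define MODEL_RCU_CORE  0x2 /* the core must also do accelerate/do_batch */

struct model_rdp {
	unsigned int flags;
	int pending_cbs;      /* callbacks that are ready to be invoked */
	int core_invocations; /* how often the modeled core has run */
};

/* Modeled do_batch: invoke at most 'limit' callbacks, return the count. */
static int model_do_batch(struct model_rdp *rdp, int limit)
{
	int n = rdp->pending_cbs < limit ? rdp->pending_cbs : limit;

	rdp->pending_cbs -= n;
	return n;
}

/* Modeled core pass: returns true if it has to be invoked again. */
static bool model_rcu_core(struct model_rdp *rdp, int limit)
{
	rdp->core_invocations++;

	/* Fully offloaded and flag clear: callbacks are the kthreads' job. */
	if ((rdp->flags & MODEL_OFFLOADED) && !(rdp->flags & MODEL_RCU_CORE))
		return false;

	model_do_batch(rdp, limit);

	/* Still pending after do_batch: go again while the flag is set. */
	return rdp->pending_cbs > 0 && (rdp->flags & MODEL_RCU_CORE);
}

/* Modeled deoffload: set the flag first, then kick the core. */
static void model_deoffload(struct model_rdp *rdp, int limit)
{
	assert(rdp->flags & MODEL_OFFLOADED);

	rdp->flags |= MODEL_RCU_CORE;        /* very first step: the flag */
	while (model_rcu_core(rdp, limit))   /* then invoke the core */
		;
	rdp->flags &= ~(MODEL_OFFLOADED | MODEL_RCU_CORE);
}
```

In this model a core pass on a fully offloaded rdp leaves the callbacks
alone, while model_deoffload() keeps running core passes until nothing is
pending before it clears both flags.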

Ok, now to write the patches.