Message-Id: <20161110045213.GY4127@linux.vnet.ibm.com>
Date: Wed, 9 Nov 2016 20:52:13 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Chris Metcalf <cmetcalf@...lanox.com>,
Gilad Ben Yossef <giladb@...lanox.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>, Tejun Heo <tj@...nel.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Lameter <cl@...ux.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Francis Giraldeau <francis.giraldeau@...il.com>,
Andi Kleen <andi@...stfloor.org>,
Arnd Bergmann <arnd@...db.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: task isolation discussion at Linux Plumbers
On Wed, Nov 09, 2016 at 05:44:02PM -0800, Andy Lutomirski wrote:
> On Wed, Nov 9, 2016 at 9:38 AM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
>
> Are you planning on changing rcu_nmi_enter()? It would make it easier
> to figure out how they interact if I could see the code.
It already calls rcu_dynticks_eqs_exit(), courtesy of the earlier
consolidation patches.
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index dbf20b058f48..342c8ee402d6 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
>
>
> > /*
> > @@ -305,17 +318,22 @@ static void rcu_dynticks_eqs_enter(void)
> > static void rcu_dynticks_eqs_exit(void)
> > {
> > struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > + int seq;
> >
> > /*
> > - * CPUs seeing atomic_inc() must see prior idle sojourns,
> > + * CPUs seeing atomic_inc_return() must see prior idle sojourns,
> > * and we also must force ordering with the next RCU read-side
> > * critical section.
> > */
> > - smp_mb__before_atomic(); /* See above. */
> > - atomic_inc(&rdtp->dynticks);
> > - smp_mb__after_atomic(); /* See above. */
> > + seq = atomic_inc_return(&rdtp->dynticks);
> > WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> > - !(atomic_read(&rdtp->dynticks) & 0x1));
> > + !(seq & RCU_DYNTICK_CTRL_CTR));
>
> I think there's still a race here. Suppose we're running this code on
> cpu n and...
>
> > + if (seq & RCU_DYNTICK_CTRL_MASK) {
> > + rcu_eqs_special_exit();
> > + /* Prefer duplicate flushes to losing a flush. */
> > + smp_mb__before_atomic(); /* NMI safety. */
>
> ... another CPU changes the page tables and calls rcu_eqs_special_set(n) here.
But then rcu_eqs_special_set() will return false because we already
exited the extended quiescent state at the atomic_inc_return() above.
That should tell the caller to send an IPI.
> That CPU expects that we will flush prior to continuing, but we won't.
> Admittedly it's highly unlikely that any stale TLB entries would be
> created yet, but nothing rules it out.
That said, 0day is having some heartburn from this, so I must have broken
something somewhere. My own tests of course complete just fine...
> > + atomic_and(~RCU_DYNTICK_CTRL_MASK, &rdtp->dynticks);
> > + }
>
> Maybe the way to handle it is something like:
>
> this_cpu_write(rcu_nmi_needs_eqs_special, 1);
> barrier();
>
> /* NMI here will call rcu_eqs_special_exit() regardless of the value
> in dynticks */
>
> atomic_and(...);
> smp_mb__after_atomic();
> rcu_eqs_special_exit();
>
> barrier();
> this_cpu_write(rcu_nmi_needs_eqs_special, 0);
>
>
> Then rcu_nmi_enter() would call rcu_eqs_special_exit() if the dynticks
> bit is set *or* rcu_nmi_needs_eqs_special is set.
>
> Does that make sense?
I believe that rcu_eqs_special_set() returning false covers this, but I
could easily be missing something.
Thanx, Paul