[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171005134042.ccftrgkay7pvbw3g@hirez.programming.kicks-ass.net>
Date: Thu, 5 Oct 2017 15:40:42 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, mingo@...nel.org,
jiangshanlai@...il.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, tglx@...utronix.de, dhowells@...hat.com,
edumazet@...gle.com, fweisbec@...il.com, oleg@...hat.com
Subject: Re: [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of
migrations and delays
On Thu, Oct 05, 2017 at 09:17:03AM -0400, Steven Rostedt wrote:
> On Wed, 4 Oct 2017 14:29:27 -0700
> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
>
> > Consider the following admittedly improbable sequence of events:
> >
> > o RCU is initially idle.
> >
> > o Task A on CPU 0 executes rcu_read_lock().
>
> A starts rcu_read_lock() critical section.
>
> >
> > o Task B on CPU 1 executes synchronize_rcu(), which must
> > wait on Task A:
>
> B waits for A.
>
> >
> > o Task B registers the callback, which starts a new
> > grace period, awakening the grace-period kthread
> > on CPU 3, which immediately starts a new grace period.
>
> [ isn't B blocked (off rq)? How does it migrate? ]
No, its running synchronize_rcu() but hasn't blocked yet. It would block
on wait_for_completion(), but per the very last point, we'll observe the
complete() before we block.
> > o Task B migrates to CPU 2, which provides a quiescent
> > state for both CPUs 1 and 2.
> >
> > o Both CPUs 1 and 2 take scheduling-clock interrupts,
> > and both invoke RCU_SOFTIRQ, both thus learning of the
> > new grace period.
> >
> > o Task B is delayed, perhaps by vCPU preemption on CPU 2.
> >
> > o CPUs 2 and 3 pass through quiescent states, which are reported
> > to core RCU.
> >
> > o Task B is resumed just long enough to be migrated to CPU 3,
> > and then is once again delayed.
> >
> > o Task A executes rcu_read_unlock(), exiting its RCU read-side
> > critical section.
>
> A calls rcu_read_unlock() ending the critical section
The point is that rcu_read_unlock() doesn't have memory ordering.
> >
> > o CPU 0 passes through a quiescent sate, which is reported to
> > core RCU. Only CPU 1 continues to block the grace period.
> >
> > o CPU 1 passes through a quiescent state, which is reported to
> > core RCU. This ends the grace period, and CPU 1 therefore
> > invokes its callbacks, one of which awakens Task B via
> > complete().
> >
> > o Task B resumes (still on CPU 3) and starts executing
> > wait_for_completion(), which sees that the completion has
> > already completed, and thus does not block. It returns from
> > the synchronize_rcu() without any ordering against the
> > end of Task A's RCU read-side critical section.
>
> B runs
>
>
> >
> > It can therefore mess up Task A's RCU read-side critical section,
> > in theory, anyway.
>
> I don't see how B ran during A's critical section.
It didn't but that doesn't mean the memory ordering agrees. What we
require is B observes (per the memory ordering) everything up to and
including the rcu_read_unlock(). This is not 'time' related.
That said, I don't think it can actually happen, because CPU0's QS state
is ordered against the complete and the wait_for_completion is ordered
against the complete.
Powered by blists - more mailing lists