Message-Id: <1507152575-11055-1-git-send-email-paulmck@linux.vnet.ibm.com>
Date: Wed, 4 Oct 2017 14:29:27 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...nel.org, jiangshanlai@...il.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
fweisbec@...il.com, oleg@...hat.com,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: [PATCH tip/core/rcu 1/9] rcu: Provide GP ordering in face of migrations and delays
Consider the following admittedly improbable sequence of events:

o	RCU is initially idle.

o	Task A on CPU 0 executes rcu_read_lock().

o	Task B on CPU 1 executes synchronize_rcu(), which must
	wait on Task A:

	o	Task B registers the callback, which starts a new
		grace period, awakening the grace-period kthread
		on CPU 3, which immediately starts a new grace period.

	o	Task B migrates to CPU 2, which provides a quiescent
		state for both CPUs 1 and 2.

	o	Both CPUs 1 and 2 take scheduling-clock interrupts,
		and both invoke RCU_SOFTIRQ, both thus learning of the
		new grace period.

	o	Task B is delayed, perhaps by vCPU preemption on CPU 2.

o	CPUs 2 and 3 pass through quiescent states, which are reported
	to core RCU.

o	Task B is resumed just long enough to be migrated to CPU 3,
	and then is once again delayed.

o	Task A executes rcu_read_unlock(), exiting its RCU read-side
	critical section.

o	CPU 0 passes through a quiescent state, which is reported to
	core RCU.  Only CPU 1 continues to block the grace period.

o	CPU 1 passes through a quiescent state, which is reported to
	core RCU.  This ends the grace period, and CPU 1 therefore
	invokes its callbacks, one of which awakens Task B via
	complete().

o	Task B resumes (still on CPU 3) and starts executing
	wait_for_completion(), which sees that the completion has
	already completed, and thus does not block.  It returns from
	synchronize_rcu() without any ordering against the
	end of Task A's RCU read-side critical section.

It can therefore mess up Task A's RCU read-side critical section,
in theory, anyway.
However, if CPU hotplug ever gets rid of stop_machine(), there will be
more straightforward ways for this sort of thing to happen, so this
commit adds a memory barrier in order to enforce the needed ordering.
Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
---
kernel/rcu/update.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5033b66d2753..9e599fcdd7bf 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -413,6 +413,16 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
wait_for_completion(&rs_array[i].completion);
destroy_rcu_head_on_stack(&rs_array[i].head);
}
+
+ /*
+ * If we migrated after we registered a callback, but before the
+ * corresponding wait_for_completion(), we might now be running
+ * on a CPU that has not yet noticed that the corresponding grace
+ * period has ended. That CPU might not yet be fully ordered
+ * against the completion of the grace period, so the full memory
+ * barrier below enforces that ordering via the completion's state.
+ */
+ smp_mb(); /* ^^^ */
}
EXPORT_SYMBOL_GPL(__wait_rcu_gp);
--
2.5.2