[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20191004145741.118292-1-joel@joelfernandes.org>
Date: Fri, 4 Oct 2019 10:57:41 -0400
From: "Joel Fernandes (Google)" <joel@...lfernandes.org>
To: linux-kernel@...r.kernel.org
Cc: Joel Fernandes <joel@...lfernandes.org>, bristot@...hat.com,
peterz@...radead.org, oleg@...hat.com, paulmck@...nel.org,
rcu@...r.kernel.org, Josh Triplett <josh@...htriplett.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: [PATCH] Remove GP_REPLAY state from rcu_sync
From: Joel Fernandes <joel@...lfernandes.org>
Please consider this is an RFC for discussion only. Just want to discuss
why the GP_REPLAY state is needed at all.
Here's the intention AFAICS:
When rcu_sync_exit() has happened, the gp_state changes to GP_EXIT while
we wait for a grace period before transitioning to GP_IDLE. In the
meanwhile, if we receive another rcu_sync_exit(), then we want to wait
for another GP to account for that.
Drawing some timing diagrams, it looks like this:
Legend:
rse = rcu_sync_enter
rsx = rcu_sync_exit
i = GP_IDLE
x = GP_EXIT
r = GP_REPLAY
e = GP_ENTER
p = GP_PASSED
rx = GP_REPLAY changes to GP_EXIT
GP num = The GP we are one.
note: A GP passes between the states:
e and p
x and i
x and rx
rx and i
In a simple case, the timing and states look like:
time
---------------------->
GP num 1111111 2222222
GP state i e p x i
CPU0 : rse rsx
However we can enter the replay state like this:
time
---------------------->
GP num 1111111 2222222222222222222223333333
GP state i e p x r rx i
CPU0 : rse rsx
CPU1 : rse rsx
Due to the second rse + rsx, we had to wait for another GP.
I believe the rationale is, if another rsx happens, another GP has to
happen.
But this is not always true if you consider the following events:
time
---------------------->
GP num 111111 22222222222222222222222222222222233333333
GP state i e p x r rx i
CPU0 : rse rsx
CPU1 : rse rsx
CPU2 : rse rsx
Here, we had 3 grace periods that elapsed, 1 for the rcu_sync_enter(),
and 2 for the rcu_sync_exit(s).
However, we had 3 rcu_sync_exit()s, not 2. In other words, the
rcu_sync_exit() got batched.
So my point here is, rcu_sync_exit() does not really always cause a new
GP to happen and we can end up having the rcu_sync_exit()s as batched
and sharing the same grace period.
Then what is the point of the GP_REPLAY state at all if it does not
always wait for a new GP? Taking a step back, why did we intend to have
to wait for a new GP if another rcu_sync_exit() comes while one is still
in progress?
Cc: bristot@...hat.com
Cc: peterz@...radead.org
Cc: oleg@...hat.com
Cc: paulmck@...nel.org
Cc: rcu@...r.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
---
kernel/rcu/sync.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index d4558ab7a07d..4f3aad67992c 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -10,7 +10,7 @@
#include <linux/rcu_sync.h>
#include <linux/sched.h>
-enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT, GP_REPLAY };
+enum { GP_IDLE = 0, GP_ENTER, GP_PASSED, GP_EXIT };
#define rss_lock gp_wait.lock
@@ -85,13 +85,6 @@ static void rcu_sync_func(struct rcu_head *rhp)
*/
WRITE_ONCE(rsp->gp_state, GP_PASSED);
wake_up_locked(&rsp->gp_wait);
- } else if (rsp->gp_state == GP_REPLAY) {
- /*
- * A new rcu_sync_exit() has happened; requeue the callback to
- * catch a later GP.
- */
- WRITE_ONCE(rsp->gp_state, GP_EXIT);
- rcu_sync_call(rsp);
} else {
/*
* We're at least a GP after the last rcu_sync_exit(); eveybody
@@ -167,16 +160,13 @@ void rcu_sync_enter(struct rcu_sync *rsp)
*/
void rcu_sync_exit(struct rcu_sync *rsp)
{
- WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
- WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
+ WARN_ON_ONCE(READ_ONCE(rsp->gp_state) < GP_PASSED);
spin_lock_irq(&rsp->rss_lock);
if (!--rsp->gp_count) {
if (rsp->gp_state == GP_PASSED) {
WRITE_ONCE(rsp->gp_state, GP_EXIT);
rcu_sync_call(rsp);
- } else if (rsp->gp_state == GP_EXIT) {
- WRITE_ONCE(rsp->gp_state, GP_REPLAY);
}
}
spin_unlock_irq(&rsp->rss_lock);
--
2.23.0.581.g78d2f28ef7-goog
Powered by blists - more mailing lists