[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20180626000859.23572-29-paulmck@linux.vnet.ibm.com>
Date: Mon, 25 Jun 2018 17:08:48 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...nel.org, jiangshanlai@...il.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
fweisbec@...il.com, oleg@...hat.com, joel@...lfernandes.org,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: [PATCH tip/core/rcu 29/40] doc: Update data-structure documentation for ->gp_seq
Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
---
.../Data-Structures/Data-Structures.html | 118 ++++++++++--------
1 file changed, 63 insertions(+), 55 deletions(-)
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 6c06e10bd04b..f5120a00f511 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -380,31 +380,26 @@ and therefore need no protection.
as follows:
<pre>
- 1 unsigned long gpnum;
- 2 unsigned long completed;
+ 1 unsigned long gp_seq;
</pre>
<p>RCU grace periods are numbered, and
-the <tt>->gpnum</tt> field contains the number of the grace
-period that started most recently.
-The <tt>->completed</tt> field contains the number of the
-grace period that completed most recently.
-If the two fields are equal, the RCU grace period that most recently
-started has already completed, and therefore the corresponding
-flavor of RCU is idle.
-If <tt>->gpnum</tt> is one greater than <tt>->completed</tt>,
-then <tt>->gpnum</tt> gives the number of the current RCU
-grace period, which has not yet completed.
-Any other combination of values indicates that something is broken.
-These two fields are protected by the root <tt>rcu_node</tt>'s
+the <tt>->gp_seq</tt> field contains the current grace-period
+sequence number.
+The bottom two bits are the state of the current grace period,
+which can be zero for not yet started or one for in progress.
+In other words, if the bottom two bits of <tt>->gp_seq</tt> are
+zero, the corresponding flavor of RCU is idle.
+Any other value in the bottom two bits indicates that something is broken.
+This field is protected by the root <tt>rcu_node</tt> structure's
<tt>->lock</tt> field.
-</p><p>There are <tt>->gpnum</tt> and <tt>->completed</tt> fields
+</p><p>There are <tt>->gp_seq</tt> fields
in the <tt>rcu_node</tt> and <tt>rcu_data</tt> structures
as well.
The fields in the <tt>rcu_state</tt> structure represent the
-most current values, and those of the other structures are compared
-in order to detect the start of a new grace period in a distributed
+most current value, and those of the other structures are compared
+in order to detect the beginnings and ends of grace periods in a distributed
fashion.
The values flow from <tt>rcu_state</tt> to <tt>rcu_node</tt>
(down the tree from the root to the leaves) to <tt>rcu_data</tt>.
@@ -512,27 +507,47 @@ than to be heisenbugged out of existence.
as follows:
<pre>
- 1 unsigned long gpnum;
- 2 unsigned long completed;
+ 1 unsigned long gp_seq;
+ 2 unsigned long gp_seq_needed;
</pre>
-<p>These fields are the counterparts of the fields of the same name in
-the <tt>rcu_state</tt> structure.
-They each may lag up to one behind their <tt>rcu_state</tt>
-counterparts.
-If a given <tt>rcu_node</tt> structure's <tt>->gpnum</tt> and
-<tt>->complete</tt> fields are equal, then this <tt>rcu_node</tt>
+<p>The <tt>rcu_node</tt> structures' <tt>->gp_seq</tt> fields are
+the counterparts of the field of the same name in the <tt>rcu_state</tt>
+structure.
+They each may lag up to one step behind their <tt>rcu_state</tt>
+counterpart.
+If the bottom two bits of a given <tt>rcu_node</tt> structure's
+<tt>->gp_seq</tt> field is zero, then this <tt>rcu_node</tt>
structure believes that RCU is idle.
-Otherwise, as with the <tt>rcu_state</tt> structure,
-the <tt>->gpnum</tt> field will be one greater than the
-<tt>->complete</tt> fields, with <tt>->gpnum</tt>
-indicating which grace period this <tt>rcu_node</tt> believes
-is still being waited for.
+</p><p>The <tt>>gp_seq</tt> field of each <tt>rcu_node</tt>
+structure is updated at the beginning and the end
+of each grace period.
+
+<p>The <tt>->gp_seq_needed</tt> fields record the
+furthest-in-the-future grace period request seen by the corresponding
+<tt>rcu_node</tt> structure. The request is considered fulfilled when
+the value of the <tt>->gp_seq</tt> field equals or exceeds that of
+the <tt>->gp_seq_needed</tt> field.
-</p><p>The <tt>>gpnum</tt> field of each <tt>rcu_node</tt>
-structure is updated at the beginning
-of each grace period, and the <tt>->completed</tt> fields are
-updated at the end of each grace period.
+<table>
+<tr><th> </th></tr>
+<tr><th align="left">Quick Quiz:</th></tr>
+<tr><td>
+ Suppose that this <tt>rcu_node</tt> structure doesn't see
+ a request for a very long time.
+ Won't wrapping of the <tt>->gp_seq</tt> field cause
+ problems?
+</td></tr>
+<tr><th align="left">Answer:</th></tr>
+<tr><td bgcolor="#ffffff"><font color="ffffff">
+ No, because if the <tt>->gp_seq_needed</tt> field lags behind the
+ <tt>->gp_seq</tt> field, the <tt>->gp_seq_needed</tt> field
+ will be updated at the end of the grace period.
+ Modulo-arithmetic comparisons therefore will always get the
+ correct answer, even with wrapping.
+</font></td></tr>
+<tr><td> </td></tr>
+</table>
<h5>Quiescent-State Tracking</h5>
@@ -626,9 +641,8 @@ normal and expedited grace periods, respectively.
</ol>
<p><font color="ffffff">So the locking is absolutely required in
- order to coordinate
- clearing of the bits with the grace-period numbers in
- <tt>->gpnum</tt> and <tt>->completed</tt>.
+ order to coordinate clearing of the bits with updating of the
+ grace-period sequence number in <tt>->gp_seq</tt>.
</font></td></tr>
<tr><td> </td></tr>
</table>
@@ -1038,15 +1052,15 @@ out any <tt>rcu_data</tt> structure for which this flag is not set.
as follows:
<pre>
- 1 unsigned long completed;
- 2 unsigned long gpnum;
+ 1 unsigned long gp_seq;
+ 2 unsigned long gp_seq_needed;
3 bool cpu_no_qs;
4 bool core_needs_qs;
5 bool gpwrap;
6 unsigned long rcu_qs_ctr_snap;
</pre>
-<p>The <tt>completed</tt> and <tt>gpnum</tt>
+<p>The <tt>->gp_seq</tt> and <tt>->gp_seq_needed</tt>
fields are the counterparts of the fields of the same name
in the <tt>rcu_state</tt> and <tt>rcu_node</tt> structures.
They may each lag up to one behind their <tt>rcu_node</tt>
@@ -1054,15 +1068,9 @@ counterparts, but in <tt>CONFIG_NO_HZ_IDLE</tt> and
<tt>CONFIG_NO_HZ_FULL</tt> kernels can lag
arbitrarily far behind for CPUs in dyntick-idle mode (but these counters
will catch up upon exit from dyntick-idle mode).
-If a given <tt>rcu_data</tt> structure's <tt>->gpnum</tt> and
-<tt>->complete</tt> fields are equal, then this <tt>rcu_data</tt>
+If the lower two bits of a given <tt>rcu_data</tt> structure's
+<tt>->gp_seq</tt> are zero, then this <tt>rcu_data</tt>
structure believes that RCU is idle.
-Otherwise, as with the <tt>rcu_state</tt> and <tt>rcu_node</tt>
-structure,
-the <tt>->gpnum</tt> field will be one greater than the
-<tt>->complete</tt> fields, with <tt>->gpnum</tt>
-indicating which grace period this <tt>rcu_data</tt> believes
-is still being waited for.
<table>
<tr><th> </th></tr>
@@ -1070,13 +1078,13 @@ is still being waited for.
<tr><td>
All this replication of the grace period numbers can only cause
massive confusion.
- Why not just keep a global pair of counters and be done with it???
+ Why not just keep a global sequence number and be done with it???
</td></tr>
<tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff">
- Because if there was only a single global pair of grace-period
+ Because if there was only a single global sequence
numbers, there would need to be a single global lock to allow
- safely accessing and updating them.
+ safely accessing and updating it.
And if we are not going to have a single global lock, we need
to carefully manage the numbers on a per-node basis.
Recall from the answer to a previous Quick Quiz that the consequences
@@ -1091,8 +1099,8 @@ CPU has not yet passed through a quiescent state,
while the <tt>->core_needs_qs</tt> flag indicates that the
RCU core needs a quiescent state from the corresponding CPU.
The <tt>->gpwrap</tt> field indicates that the corresponding
-CPU has remained idle for so long that the <tt>completed</tt>
-and <tt>gpnum</tt> counters are in danger of overflow, which
+CPU has remained idle for so long that the
+<tt>gp_seq</tt> counter is in danger of overflow, which
will cause the CPU to disregard the values of its counters on
its next exit from idle.
Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect
@@ -1130,10 +1138,10 @@ The CPU advances the callbacks in its <tt>rcu_data</tt> structure
whenever it notices that another RCU grace period has completed.
The CPU detects the completion of an RCU grace period by noticing
that the value of its <tt>rcu_data</tt> structure's
-<tt>->completed</tt> field differs from that of its leaf
+<tt>->gp_seq</tt> field differs from that of its leaf
<tt>rcu_node</tt> structure.
Recall that each <tt>rcu_node</tt> structure's
-<tt>->completed</tt> field is updated at the end of each
+<tt>->gp_seq</tt> field is updated at the beginnings and ends of each
grace period.
<p>
--
2.17.1
Powered by blists - more mailing lists