lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 20 May 2018 22:48:26 -0700
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Randy Dunlap <rdunlap@...radead.org>
Cc:     Joel Fernandes <joelaf@...gle.com>, linux-kernel@...r.kernel.org,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Josh Triplett <josh@...htriplett.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>, byungchul.park@....com,
        kernel-team@...roid.com
Subject: Re: [PATCH v3 1/4] rcu: Add comment documenting how rcu_seq_snap
 works

On Sun, May 20, 2018 at 09:50:25PM -0700, Randy Dunlap wrote:
> On 05/20/2018 09:42 PM, Joel Fernandes wrote:
> > rcu_seq_snap may be tricky to decipher. Lets document how it works with
> > an example to make it easier.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > ---
> >  kernel/rcu/rcu.h | 33 ++++++++++++++++++++++++++++++++-
> >  1 file changed, 32 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index 0453a7d12b3f..d4396c96f614 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -91,7 +91,38 @@ static inline void rcu_seq_end(unsigned long *sp)
> >  	WRITE_ONCE(*sp, rcu_seq_endval(sp));
> >  }
> >  
> > -/* Take a snapshot of the update side's sequence number. */
> > +/*
> > + * rcu_seq_snap - Take a snapshot of the update side's sequence number.
> > + *
> > + * This function returns the earliest value of the grace-period sequence number
> > + * that will indicate that a full grace period has elapsed since the current
> > + * time.  Once the grace-period sequence number has reached this value, it will
> > + * be safe to invoke all callbacks that have been registered prior to the
> > + * current time. This value is the current grace-period number plus two to the
> > + * power of the number of low-order bits reserved for state, then rounded up to
> > + * the next value in which the state bits are all zero.
> > + *
> > + * For example, since RCU_SEQ_STATE_MASK=3 and the least significant bit of
> > + * the seq is used to track if a GP is in progress or not, its sufficient if we
> 
>                                                               it's
> 
> > + * add (6+1) and mask with ~3 to get the next GP. Let's see why with an example:
> > + *
> > + * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
> > + * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
> > + * to account for the shift due to 2 state bits. Now, if the current seq is
> > + * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
> > + * is already in progress so the next GP that a future call back will be queued
> > + * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
> > + * the 2 lower bits by adding 0b11. Incase the lower bit was set, the overflow
> 
>                                        In case
> 
> > + * will cause the extra +1 to the GP, along with the usual +1 explained before.
> > + * This gives us GP+2. Finally we mask the lower to bits by ~0x3 incase the
> 
>                                                                     in case
> 
> > + * overflow didn't occur. This masking is needed because incase RCU was idle
> 
>                                                             in case
> 
> > + * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
> > + * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
> > + *
> > + * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
> > + * which can be generalized to:
> > + * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
> > + */
> >  static inline unsigned long rcu_seq_snap(unsigned long *sp)
> >  {
> >  	unsigned long s;
> > 
> 
> cheers.
> -- 
> ~Randy

Thanks Randy. Fixed, updated patch below. Paul, let me know if you want
me to send it separately or if you can pick it up from below.

Also I realize I need some better automated tools to catch these issues
(spelling errors in commit, diffs etc). Probably checkpatch.pl should
have such checks for these common things too.

----------8<----------

>From 1c1f8ce04bca656a3c07e555048545d4a59e44cf Mon Sep 17 00:00:00 2001
From: Joel Fernandes <joel@...lfernandes.org>
Date: Sun, 20 May 2018 19:37:18 -0700
Subject: [PATCH v3.5] rcu: Add comment documenting how rcu_seq_snap works

rcu_seq_snap may be tricky to decipher. Lets document how it works with
an example to make it easier.

Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
---
 kernel/rcu/rcu.h | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 0453a7d12b3f..00df3da98317 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -91,7 +91,38 @@ static inline void rcu_seq_end(unsigned long *sp)
 	WRITE_ONCE(*sp, rcu_seq_endval(sp));
 }
 
-/* Take a snapshot of the update side's sequence number. */
+/*
+ * rcu_seq_snap - Take a snapshot of the update side's sequence number.
+ *
+ * This function returns the earliest value of the grace-period sequence number
+ * that will indicate that a full grace period has elapsed since the current
+ * time.  Once the grace-period sequence number has reached this value, it will
+ * be safe to invoke all callbacks that have been registered prior to the
+ * current time. This value is the current grace-period number plus two to the
+ * power of the number of low-order bits reserved for state, then rounded up to
+ * the next value in which the state bits are all zero.
+ *
+ * For example, since RCU_SEQ_STATE_MASK=3 and the least significant bit of
+ * the seq is used to track if a GP is in progress or not, its sufficient if we
+ * add (6+1) and mask with ~3 to get the next GP. Let's see why with an example:
+ *
+ * Say the current seq is 12 which is 0b1100 (GP is 3 and state bits are 0b00).
+ * To get to the next GP number of 4, we have to add 0b100 to this (0x1 << 2)
+ * to account for the shift due to 2 state bits. Now, if the current seq is
+ * 13 (GP is 3 and state bits are 0b01), then it means the current grace period
+ * is already in progress so the next GP that a future call back will be queued
+ * to run at is GP+2 = 5, not 4. To account for the extra +1, we just overflow
+ * the 2 lower bits by adding 0b11. In case the lower bit was set, the overflow
+ * will cause the extra +1 to the GP, along with the usual +1 explained before.
+ * This gives us GP+2. Finally we mask the lower to bits by ~0x3 in case the
+ * overflow didn't occur. This masking is needed because in case RCU was idle
+ * (no GP in progress so lower 2 bits are 0b00), then the overflow of the lower
+ * 2 state bits wouldn't occur, so we mask to zero out those lower 2 bits.
+ *
+ * In other words, the next seq can be obtained by (0b11 + 0b100) & (~0b11)
+ * which can be generalized to:
+ * seq + (RCU_SEQ_STATE_MASK + (RCU_SEQ_STATE_MASK + 1)) & (~RCU_SEQ_STATE_MASK)
+ */
 static inline unsigned long rcu_seq_snap(unsigned long *sp)
 {
 	unsigned long s;
-- 
2.17.0.441.gb46fe60e1d-goog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ