linux-kernel - Re: [PATCH RFC] srcu: Yet more detail for srcu_readers_active_idx

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221216165144.GA4001@paulmck-ThinkPad-P17-Gen-1>
Date:   Fri, 16 Dec 2022 08:51:44 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Frederic Weisbecker <frederic@...nel.org>, boqun.feng@...il.com,
        neeraj.iitr10@...il.com, urezki@...il.com, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] srcu: Yet more detail for
 srcu_readers_active_idx_check() comments

On Fri, Dec 16, 2022 at 04:32:39PM +0000, Joel Fernandes wrote:
> On Thu, Dec 15, 2022 at 05:09:14PM -0800, Paul E. McKenney wrote:
> [...]
> > > >> 2. unlock()'s smp_mb() happened before Flip+smp_mb() , now the reader
> > > >> has no new smp_mb() that happens AFTER the flip happened. So it can
> > > >> totally sample the old idx again -- that particular reader will
> > > >> increment twice, but the next time, it will see the flipped one.
> > > > 
> > > > I will let you transliterate both.  ;-)
> > > 
> > > I think I see what you mean now :)
> > > 
> > > I believe the access I am referring to is the read of idx on one side and
> > > the write to idx on the other. However that is incomplete and I need to
> > > pair that with some of other access on both sides.
> > > 
> > > So perhaps this:
> > > 
> > > Writer does flip + smp_mb + read unlock counts [1]
> > > 
> > > Reader does:
> > >  read idx + smp_mb() + increment lock counts [2]
> > > 
> > > And subsequently reader does
> > > Smp_mb() + increment unlock count. [3]
> > > 
> > > So [1] races with either [2] or [2]+[3].
> > > 
> > > Is that fair?
> > 
> > That does look much better, thank you!
> 
> Perhaps a comment with an ASCII diagram will help?
> 
> 
> Case 2:
> Both the reader and the updater see each other's writes too late, but because
> of memory barriers on both sides, they will eventually see each other's write
> with respect to their own. This is similar to the store-buffer problem. This
> let's a single reader contribute a maximum (unlock minus lock) imbalance of 2.
> 
> The following diagram shows the subtle worst case followed by a simplified
> store-buffer explanation.
> 
> READER                  UPDATER
> -------------           ----------
>                            // idx is initially 0.
> read_lock() {
>   READ(idx) = 0;
>   lock[0]++; --------------------------------------------,
>                            flip() {                      |               
>                               smp_mb();                  |
>   smp_mb();                                              |
> }                                                        |
>                                                          |
> // RSCS                                                  |
>                                                          |
> read_unlock() {                                          |
>   smp_mb();                                              |
>                               idx++;  // P               |
>                               smp_mb();                  |
>                            }                             |
>                                                          |
>                            scan_readers_idx(0) {         |
>                                count all unlock[0];      |
>                                    |                     |
>                                    |                     |
>   unlock[0]++; //X <--not-counted--`-----,               |
>                                          |               |
> }                                        V               `------,
>                                // Will make sure next scan      |
>                                // will not miss this unlock (X) |
>                                // if other side saw flip (P) ,--`
>                                // Call this MB [1]           |
>                                // Order write(idx) with      |
>                                // next scan's unlock.        |
>                                smp_mb();                 ,---`
> read_lock() {                                            |
>   READ(idx)=0;                                           |
>   lock[0]++; ----------------> count all lock[0];        |
>   smp_mb();         |     }                              |
> }     |             |                                    V
>       |             `---> // Incorrect contribution to lock counting
>       |                   // upto a maximum of 2 times.
>       |
>        `---> // Pairs with MB [1]. Makes sure that
>              // the next read_lock()'s' idx read (Y) is ordered
>              // with above write to unlock[0] (X).
>                             |
> rcu_read_unlock() {         |
>   smp_mb(); <---------------`
>   unlock[0]++; 
> }
> 
> read_lock() {
>   READ(idx) = 1; //Y
>   lock[1]++;
>   ...
> }
>                            scan_readers_idx(0) {
>                                count all unlock[0]; //Q
>                                ...
> 
> 
> thanks,
> 
>  - Joel
> 
>                           }
> 
> This makes it similar to the store buffer pattern. Using X, Y, P and Q
> annotated above, we get:
> 
> READER                    UPDATER
> X (write)                 P (write)
> 
> smp_mb();                 smp_mb();
> 
> Y (read)                  Q (read)

Given that this diagram is more than 50 lines long, it might go better in
a design document describing this part of RCU.  Perhaps less detail or
segmented, but the same general idea as this guy:

Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst

Thoughts?

						Thanx, Paul