linux-kernel - Re: [PATCH v7 6/6] rcu/segcblist: Add additional comments to explain smp

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201019123730.GA34192@lothringen>
Date:   Mon, 19 Oct 2020 14:37:30 +0200
From:   Frederic Weisbecker <frederic@...nel.org>
To:     joel@...lfernandes.org
Cc:     linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Marco Elver <elver@...gle.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>,
        "Uladzislau Rezki (Sony)" <urezki@...il.com>, fweisbec@...il.com,
        neeraj.iitr10@...il.com, stern@...land.harvard.edu
Subject: Re: [PATCH v7 6/6] rcu/segcblist: Add additional comments to explain
 smp_mb()

On Sat, Oct 17, 2020 at 08:35:56PM -0400, joel@...lfernandes.org wrote:
> On Sat, Oct 17, 2020 at 03:29:54PM +0200, Frederic Weisbecker wrote:
> > > C rcubarrier+ctrldep
> > > 
> > > (*
> > >  * Result: Never
> > >  *
> > >  * This litmus test shows that rcu_barrier (P1) prematurely
> > >  * returning by reading len 0 can cause issues if P0 does
> > >  * NOT have a smb_mb() after WRITE_ONCE(len, 1).
> > >  * mod_data == 2 means module was unloaded (so data is garbage).
> > >  *)
> > > 
> > > { int len = 0; int enq = 0; }
> > > 
> > > P0(int *len, int *mod_data, int *enq)
> > > {
> > > 	int r0;
> > > 
> > > 	WRITE_ONCE(*len, 1);
> > > 	smp_mb();		/* Needed! */
> > > 	WRITE_ONCE(*enq, 1);
> > > 
> > > 	r0 = READ_ONCE(*mod_data);
> > > }
> > > 
> > > P1(int *len, int *mod_data, int *enq)
> > > {
> > > 	int r0;
> > > 	int r1;
> > > 
> > > 	r1 = READ_ONCE(*enq);
> > > 
> > > 	// barrier Just for test purpose ("exists" clause) to force the..
> > > 	// ..rcu_barrier() to see enq before len
> > > 	smp_mb();		
> > > 	r0 = READ_ONCE(*len);
> > > 
> > > 	// implicit memory barrier due to conditional */
> > > 	if (r0 == 0)
> > > 		WRITE_ONCE(*mod_data, 2);
> > > }
> > 
> > I'm not sure what scenario P1 refers to in practice, and to what module?
> 
> Kernel module usecase for rcu_barrier. See the docs.

My bad, I'm just reading that documentation now :-s

> > 
> > I'm very likely missing something obvious somewhere.
> > 
> >       CPU 0                                CPU 1
> >       rcu_barrier()                        call_rcu()/rcu_segcblist_enqueue()
> >       ------------                         --------
> > 
> >                                            smp_mb();
> >                                            inc_len();
> > 					   smp_mb();
> > 					   queue callback;
> >       for_each_possible_cpu(cpu)
> >           if (!rcu_segcblist_n_cbs(&rdp->cblist))
> > 	      continue;
> >
> 
> >						invoke_callback
> 
> If CPU 0 saw the enqueue of the callback (that is the CPU 1's writes to the
> segcb_list propagated to CPU 0), then it would have also seen the
> effects of the inc_len. I forced this case in my last litmus test by this
> code in P1():

But then I can't find to which part of rcu_barrier() this refers to.
I see the len read before anything else.

> 
>         r1 = READ_ONCE(*enq);
>         smp_mb();               /* barrier Just for test purpose to show that the.. */
>                                 /* ..rcu_barrier() saw list modification */
> 
> On the other hand, if CPU 0 did not see the enqueue, then there is really no
> issue. Since that is the same case where call_rcu() happened _after_ the
> rcu_barrier() and there's no race. rcu_barrier() does not need to wait if
> there was no callback enqueued.
> 
> This is not exactly the easiest thing to explain, hence the litmus.

Now, reading the documentation of rcu_barrier() (thanks to you!):

    Pseudo-code using rcu_barrier() is as follows:

   1. Prevent any new RCU callbacks from being posted.
   2. Execute rcu_barrier().
   3. Allow the module to be unloaded.


I think with point 1, it is assumed that the caller of rcu_barrier() must have
not only stopped but also sync'ed with the possible enqueuers. Correct me if I'm wrong
here. So for example if a kthread used to post the module RCU callbacks, calling kthread_stop()
does the job as it prevents from further RCU callbacks from being enqueued and it also syncs
with the kthread thanks to the completion implied by any caller of kthread_stop() which then
sees what the kthread has read and written, including RCU callbacks enqueued. So if the caller
of kthread_stop() calls rcu_barrier() right after, rcu_barrier() should see at least the len
corresponding to the last enqueue.

cancel_work_sync() also seem to really sync as well. I'm less sure about del_timer_sync().

Say we have:

expire_timers (CPU 0)                               CPU 1
-------------                                       -----------
detach_timer(timer)
raw_spin_unlock(&base->lock);
call_timer_fn(timer, fn, baseclk);
   -> enqueue callback
//would need at least smp_wmb() here
base->running_timer = NULL;

                                                    del_timer_sync() {
                                                        raw_spin_lock(&base->lock);
                                                        if (base->running_timer != timer)
                                                            ret = detach_if_pending(timer, base, true);
                                                                if (!timer_pending())
                                                                    return 0;
                                                        raw_spin_unlock(&base->lock);
                                                    }
                                                    //would need at least smp_rmb() here
						    //although rcu_seq_start() implies a full barrier
                                                    rcu_barrier() {
						        // Sees rcu_segcblist_n_cbs(rdp(CPU 0)->cblist) == 0
                                                        // So ignore it


But I'm sure I'm missing something obvious. That's my specialism.