[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <922177.2692.qm@web83824.mail.sp1.yahoo.com>
Date: Tue, 20 Nov 2007 19:43:09 -0800 (PST)
From: James Huang <jamesclhuang@...oo.com>
To: linux-kernel@...r.kernel.org
Cc: jamesclhuang@...oo.com
Subject: Re: __rcu_process_callbacks() in Linux 2.6
Please disregard the previous email.
In the latest Linux 2.6 RCU implementation, __rcu_process_callbacks() is coded as follows:
422 static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp,
423 struct rcu_data *rdp)
424 {
425 if (rdp->curlist && !rcu_batch_before(rcp->completed, rdp->batch)) {
426 *rdp->donetail = rdp->curlist;
427 rdp->donetail = rdp->curtail;
428 rdp->curlist = NULL;
429 rdp->curtail = &rdp->curlist;
430 }
431
432 if (rdp->nxtlist && !rdp->curlist) {
433 local_irq_disable();
434 rdp->curlist = rdp->nxtlist;
435 rdp->curtail = rdp->nxttail;
436 rdp->nxtlist = NULL;
437 rdp->nxttail = &rdp->nxtlist;
438 local_irq_enable();
439
440 /*
441 * start the next batch of callbacks
442 */
443
444 /* determine batch number */
445 rdp->batch = rcp->cur + 1;
446 /* see the comment and corresponding wmb() in
447 * the rcu_start_batch()
448 */
449 smp_rmb();
450
451 if (!rcp->next_pending) {
452 /* and start it/schedule start if it's a new batch */
453 spin_lock(&rcp->lock);
454 rcp->next_pending = 1;
455 rcu_start_batch(rcp);
456 spin_unlock(&rcp->lock);
457 }
458 }
459
460 rcu_check_quiescent_state(rcp, rdp);
461 if (rdp->donelist)
462 rcu_do_batch(rdp);
463 }
The question is how does the update of rdp->batch at line 445 guarantee to observe the most updated value of rcp->cur??
The issue is that there is no memory barrier/locking before line 445.
So I think the following sequence of events in chronological order is possible:
Assume initially rcp->cur = 100, this current batch value is visible to every CPU, and batch 100 has completed
(i.e. rcp->completed = 100, rcp->next_pending = 0, rcp->cpumask = 0, and for each CPU, rdp->quiescbatch = 100, rdp->qs_pending = 0, rdp->passed_quiesc = 1)
CPU 0:
---------
call_rcu(): a callback inserted into rdp->nxtlist;
timer interrupt
call rcu_pending(), return true ( ! rdp->curlist && rdp->nxtlist)
call rcu_check_callbacks()
schedule per CPU rcu_tasklet
__rcu_process_callbacks()
move callbacks from nxtlist to curlist;
rdp->batch = 101
lock rcp->lock
rcp->next_pending = 1
call rcu_start_batch()
find the current batch has completed and next batch pending;
rcp->next_pending = 0
update rcp->cur to 101 and initialize rcp->cpumask; <----- time t1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unlock rcp->lock
CPU 1:
---------
timer interrupt
call rcu_pending(), return true (asume observing rcp->cur = 101 != rdp->quiescbatch)
call rcu_check_callbacks()
schedule per CPU rcu_tasklet
__rcu_process_callbacks()
call rcu_check_quisecent_state()
find rdp->quiescbatch != rcp->cur
set rdp->qs_pending = 1
set rdp->passed_quiesc = 0
set rdp->quiescbatch = 101 (rcp->cur)
Another timer interrupt
call rcu_pending(), return true (rdp->qs_pending == 1)
call rcu_check_callbacks()
(assume in user mode) <-- time t2 pass quiescent state
~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~
rdp->passed_quiesc = 1
schedule per CPU rcu_tasklet
__rcu_process_callbacks()
call rcu_check_quisecent_state()
find rdp->qs_pending == 1 && rdp-> passed_quiesc == 1
set rdp->qs_pending = 0
lock rcp->lock
call cpu_quite()
clear bit in the rcp->cpumask set up by CPU 0 at time t1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
unlock rcp->lock
CPU 2:
---------
call_rcu(): a callback inserted into rdp->nxtlist; <--- time t3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
timer interrupt
call rcu_pending(), return true ( ! rdp->curlist && rdp->nxtlist)
call rcu_check_callbacks()
schedule per CPU rcu_tasklet
calls __rcu_process_callbacks()
move callbacks from nxtlist to curlist; (including the callback inserted at time t3)
rdp->batch = rcp->cur + 1; <--- time t4
~~~~~~~~~~~~~~~~~~~~
At time t4, CPU 2 might observe rcp->cpu as 100 and set rdp->batch to 101.
So CPU 2 essentially started its batch 101 (includes a callback inserted at time t3) after CPU 1 passed its quiescent state (at time t2) for batch 101.
Isn't this an issue??? Am I missing something?
Thanks for pointing me to the right direction.
-- James Huang
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists