[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090212161805.GB6759@linux.vnet.ibm.com>
Date: Thu, 12 Feb 2009 08:18:05 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Mathieu Desnoyers <compudj@...stal.dyndns.org>
Cc: ltt-dev@...ts.casi.polymtl.ca, linux-kernel@...r.kernel.org
Subject: Re: [ltt-dev] [RFC git tree] Userspace RCU (urcu) for Linux
(repost)
On Thu, Feb 12, 2009 at 12:47:07AM -0500, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > On Wed, Feb 11, 2009 at 11:10:44PM -0500, Mathieu Desnoyers wrote:
> > > * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > > > On Wed, Feb 11, 2009 at 06:33:08PM -0800, Paul E. McKenney wrote:
> > > > > On Wed, Feb 11, 2009 at 04:35:49PM -0800, Paul E. McKenney wrote:
> > > > > > On Wed, Feb 11, 2009 at 04:42:58PM -0500, Mathieu Desnoyers wrote:
> > > > > > > * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > > >
> > > > [ . . . ]
> > > >
> > > > > > > > (BTW, I do not trust my model yet, as it currently cannot detect the
> > > > > > > > failure case I pointed out earlier. :-/ Here and I thought that the
> > > > > > > > point of such models was to detect additional failure cases!!!)
> > > > > > > >
> > > > > > >
> > > > > > > Yes, I'll have to dig deeper into it.
> > > > > >
> > > > > > Well, as I said, I attached the current model and the error trail.
> > > > >
> > > > > And I had bugs in my model that allowed the rcu_read_lock() model
> > > > > to nest indefinitely, which overflowed into the top bit, messing
> > > > > things up. :-/
> > > > >
> > > > > Attached is a fixed model. This model validates correctly (woo-hoo!).
> > > > > Even better, gives the expected error if you comment out line 180 and
> > > > > uncomment line 213, this latter corresponding to the error case I called
> > > > > out a few days ago.
> > > > >
> > > > > I will play with removing models of mb...
> > > >
> > > > And commenting out the models of mb between the counter flips and the
> > > > test for readers still passes validation, as expected, and as shown in
> > > > the attached Promela code.
> > > >
> > >
> > > Hrm, in the email I sent you about the memory barrier, I said that it
> > > would not make the algorithm incorrect, but that it would cause
> > > situations where it would be impossible for the writer to do any
> > > progress as long as there are readers active. I think we would have to
> > > enhance the model or at least express this through some LTL statement to
> > > validate this specific behavior.
> >
> > But if the writer fails to make progress, then the counter remains at a
> > given value, which causes readers to drain, which allows the writer to
> > eventually make progress again. Right?
> >
>
> Not necessarily. If we don't have the proper memory barriers, we can
> have the writer waiting on, say, parity 0 *before* it has performed the
> parity switch. Therefore, even newly coming readers will add up to
> parity 0.
But the write that changes the parity will eventually make it out.
OK, so your argument is that we at least need a compiler barrier?
Regardless, please see attached for a modified version of the Promela
model that fully models omitting out the memory barrier that my
rcu_nest32.[hc] implementation omits. (It is possible to partially
model removal of other memory barriers via #if 0, but to fully model
would need to enumerate the permutations as shown on lines 231-257.)
> In your model, this is not detected, because eventually all readers will
> execute, and only then the writer will be able to update the data. But
> in reality, if we run a very busy 4096-cores machines where there is
> always at least one reader active, the the writer will be stuck forever,
> and that's really bad.
Assuming that the reordering is done by the CPU, the write will
eventually get out -- it is stuck in (say) the store buffer, and the
cache line will eventually arrive, and then the value will eventually
be seen by the readers.
We might need a -compiler- barrier, but then again, I am not sure that
we are talking about the same memory barrier -- again, please see
attached lines 231-257 to see which one that I eliminated.
Also, the original model I sent out has a minor bug that prevents it
from fully modeling the nested-read-side case. The patch below fixes this.
Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
---
urcu.spin | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/formal-model/urcu.spin b/formal-model/urcu.spin
index e5bfff3..611464b 100644
--- a/formal-model/urcu.spin
+++ b/formal-model/urcu.spin
@@ -124,9 +124,13 @@ proctype urcu_reader()
break;
:: tmp < 4 && reader_progress[tmp] != 0 ->
tmp = tmp + 1;
- :: tmp >= 4 ->
+ :: tmp >= 4 &&
+ reader_progress[0] == reader_progress[3] ->
done = 1;
break;
+ :: tmp >= 4 &&
+ reader_progress[0] != reader_progress[3] ->
+ break;
od;
do
:: tmp < 4 && reader_progress[tmp] == 0 ->
View attachment "urcu_mbmin.spin" of type "text/plain" (7514 bytes)
Download attachment "urcu_mbmin.sh" of type "application/x-sh" (57 bytes)
Powered by blists - more mailing lists