[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <BLU0-SMTP67AEF22F46362646A365A596680@phx.gbl>
Date: Tue, 14 Jun 2011 08:50:35 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Lai Jiangshan <laijs@...fujitsu.com>
CC: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
josh@...htriplett.org, Manfred Spraul <manfred@...orfullife.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] rcu,doc: lock-free update site
* Lai Jiangshan (laijs@...fujitsu.com) wrote:
> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
> update site.
>
[...]
> @@ -0,0 +1,143 @@
> +Lock-free(lockless) update site
> +
> +This article describes a pattern of using RCU to implement lock-free(lockless)
> +update site. RCU update site is considered call-rare and it is protected
> +by a update-site lock generally. But blocking algorithms are undesirable
> +in some cases for some reasons, thus, this pattern may help.
Hi Lai,
Yes, using this kind of rcu read-side lock to protect against the
cmpxchg ABA problem is well-known (to me at least) ;) I used this
technique in the userspace RCU library "lock-free queue" and "lock-free
stack" in 2010*. Please feel free to dig through my RCU data containers code
to bring in more data structure examples:
http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
Thanks!
Mathieu
* AFAIK I introduced this technique using RCU read-side C.S. to deal
with cmpxchg ABA at that point, but someone might have thought about
it before me without my knowledge. My litterature survey so far
indicates that using a double-word CAS on a pointer/counter was one of
the usual technique used to protect against cmpxchg ABA so far. Other
techniques imply allocating elements in a limited-size array (so a
simple cmpxchg can update the array index and counter atomically),
Hasard Pointers, or having a full-blown GC which provides similar
guarantees to the RCU grace period with a read-side lock held.
Ref.:
[1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
[2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
[2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"
> +
> +This pattern can only protect a single pointer which is the only reference
> +of the object.
> +
> +object pointer:
> +
> +struct my_struct *gptr;
> +
> +wait-free read site:
> +{
> + rcu_read_lock();
> + ptr = rcu_dereference(gptr);
> + my_struct_read(ptr);
> + rcu_read_unlock();
> +}
> +
> +lock-free update site(update as new):
> +{
> + new_ptr = my_struct_alloc();
> + for (;;) {
> + rcu_read_lock();
> +
> + old_ptr = rcu_dereference(gptr);
> +
> + /* copy data from old_ptr to new_ptr and update it */
> + my_struct_update(new_ptr, old_ptr);
> +
> + /* atomically publish the new_ptr and de-publish the old_ptr */
> + if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
> + rcu_read_unlock();
> +
> + /*
> + * free it after a grace-period, read sites and other
> + * update sites may be reading it in parallel.
> + */
> + kfree_rcu(old_ptr);
> +
> + /* success, exit the loop */
> + break;
> + } else {
> + rcu_read_unlock();
> +
> + /*
> + * Other update site successfully update it, we need
> + * to read the latest data and try the update again.
> + *
> + * If the other update site did the same thing we need,
> + * we can free the new_ptr and exit this loop too,
> + * and it may becomes a wait-free algorithm.
> + */
> + }
> + }
> +}
> +
> +1) In update site, rcu_read_lock() is needed for my_struct_update().
> +
> + In this kind of lock-free update site, many update sides
> + may run parallel, other update side may had successfully
> + de-published old_ptr and tried to free it. rcu_read_lock()
> + prevents old_ptr from freeing and ensures it valid for
> + my_struct_update().
> +
> +2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
> +
> + Although the content of old_ptr is not accessed when cmpxchg(),
> + but old_ptr should not be freed until cmpxchg() finished.
> + Otherwise we may miss other successful update and publish a
> + new_ptr without information from the latest object.
> +
> + Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
> + (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
> +
> + CPU0 CPU1
> + rcu_read_lock()
> + old_ptr = rcu_dereference(gptr);
> + my_struct_update(new_ptr, old_ptr);
> + rcu_read_unlock();
> + . successfully update, now gptr=other_ptr
> + . old_ptr is freed
> + .
> + . other update, my_struct_alloc() returns old_ptr
> + . successfully publish and de-publish
> + . now gptr=old_ptr again
> + .
> + cmpxchg(&gptr, old_ptr, new_ptr)
> + cmpxchg() success, but the 2 updates
> + of CPU1 are completely missed.
> +
> + This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
> + before cmpxchg() finished and to prevent ABA-problem.
> +
> +3) Beware NULL pointer.
> +
> + Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
> +
> +lock-free update site(dispose, wait-free):
> +{
> + old_ptr = xchg(&gptr, NULL);
> + if (old_ptr != NULL)
> + kfree_rcu(old_ptr);
> +}
> +
> + This code cause NULL reusing and may cause ABA-problem like above example:
> +
> + CPU0 CPU1
> + rcu_read_lock()
> + old_ptr = rcu_dereference(gptr);
> + /* old_ptr = NULL */
> + my_struct_update(new_ptr, NULL);
> + . successfully update, now gptr=other_ptr
> + .
> + . successfully dispose
> + . now gptr=NULL again
> + .
> + cmpxchg(&gptr, NULL, new_ptr)
> + cmpxchg() success, but the update
> + and the dispose of CPU1 are missed
> + consideration by CPU0.
> + rcu_read_unlock();
> +
> + In many use cases, these behaviors are OK. In these use cases,
> + my_struct_update(new_ptr, NULL) give us the same result even we retry.
> +
> + But in some raw use cases(I can't find any use-case now, I believe it exist),
> + the missed considerations of the updates are not acceptable, in this case,
> + we should use different null-value for NULL pointer for every disposing.
> +
> +lock-free update site(dispose, wait-free, paranoid version):
> +{
> + null_ptr = alloc_null_ptr();
> + old_ptr = xchg(&gptr, null_ptr);
> + if (is_null_ptr(old_ptr))
> + free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
> + else
> + kfree_rcu(old_ptr);
> +}
> +
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists