[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4DF7231B.1080708@cn.fujitsu.com>
Date: Tue, 14 Jun 2011 17:00:11 +0800
From: Lai Jiangshan <laijs@...fujitsu.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
josh@...htriplett.org, Manfred Spraul <manfred@...orfullife.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: [PATCH] rcu,doc: lock-free update site
Add a document which describes a pattern of using RCU to implement lock-free(lockless)
update site.
Singed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
Documentation/RCU/00-INDEX | 2 +
Documentation/RCU/lock-free-update-site.txt | 143 +++++++++++++++++++++++++++
2 files changed, 145 insertions(+), 0 deletions(-)
diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 1d7a885..7178dd5 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -8,6 +8,8 @@ listRCU.txt
- Using RCU to Protect Read-Mostly Linked Lists
lockdep.txt
- RCU and lockdep checking
+lock-free-update-site.txt
+ - RCU pattern of lock-free update site
NMI-RCU.txt
- Using RCU to Protect Dynamic NMI Handlers
rcubarrier.txt
diff --git a/Documentation/RCU/lock-free-update-site.txt b/Documentation/RCU/lock-free-update-site.txt
new file mode 100644
index 0000000..9b6984a
--- /dev/null
+++ b/Documentation/RCU/lock-free-update-site.txt
@@ -0,0 +1,143 @@
+Lock-free(lockless) update site
+
+This article describes a pattern of using RCU to implement lock-free(lockless)
+update site. RCU update site is considered call-rare and it is protected
+by a update-site lock generally. But blocking algorithms are undesirable
+in some cases for some reasons, thus, this pattern may help.
+
+This pattern can only protect a single pointer which is the only reference
+of the object.
+
+object pointer:
+
+struct my_struct *gptr;
+
+wait-free read site:
+{
+ rcu_read_lock();
+ ptr = rcu_dereference(gptr);
+ my_struct_read(ptr);
+ rcu_read_unlock();
+}
+
+lock-free update site(update as new):
+{
+ new_ptr = my_struct_alloc();
+ for (;;) {
+ rcu_read_lock();
+
+ old_ptr = rcu_dereference(gptr);
+
+ /* copy data from old_ptr to new_ptr and update it */
+ my_struct_update(new_ptr, old_ptr);
+
+ /* atomically publish the new_ptr and de-publish the old_ptr */
+ if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
+ rcu_read_unlock();
+
+ /*
+ * free it after a grace-period, read sites and other
+ * update sites may be reading it in parallel.
+ */
+ kfree_rcu(old_ptr);
+
+ /* success, exit the loop */
+ break;
+ } else {
+ rcu_read_unlock();
+
+ /*
+ * Other update site successfully update it, we need
+ * to read the latest data and try the update again.
+ *
+ * If the other update site did the same thing we need,
+ * we can free the new_ptr and exit this loop too,
+ * and it may becomes a wait-free algorithm.
+ */
+ }
+ }
+}
+
+1) In update site, rcu_read_lock() is needed for my_struct_update().
+
+ In this kind of lock-free update site, many update sides
+ may run parallel, other update side may had successfully
+ de-published old_ptr and tried to free it. rcu_read_lock()
+ prevents old_ptr from freeing and ensures it valid for
+ my_struct_update().
+
+2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
+
+ Although the content of old_ptr is not accessed when cmpxchg(),
+ but old_ptr should not be freed until cmpxchg() finished.
+ Otherwise we may miss other successful update and publish a
+ new_ptr without information from the latest object.
+
+ Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
+ (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
+
+ CPU0 CPU1
+ rcu_read_lock()
+ old_ptr = rcu_dereference(gptr);
+ my_struct_update(new_ptr, old_ptr);
+ rcu_read_unlock();
+ . successfully update, now gptr=other_ptr
+ . old_ptr is freed
+ .
+ . other update, my_struct_alloc() returns old_ptr
+ . successfully publish and de-publish
+ . now gptr=old_ptr again
+ .
+ cmpxchg(&gptr, old_ptr, new_ptr)
+ cmpxchg() success, but the 2 updates
+ of CPU1 are completely missed.
+
+ This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
+ before cmpxchg() finished and to prevent ABA-problem.
+
+3) Beware NULL pointer.
+
+ Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
+
+lock-free update site(dispose, wait-free):
+{
+ old_ptr = xchg(&gptr, NULL);
+ if (old_ptr != NULL)
+ kfree_rcu(old_ptr);
+}
+
+ This code cause NULL reusing and may cause ABA-problem like above example:
+
+ CPU0 CPU1
+ rcu_read_lock()
+ old_ptr = rcu_dereference(gptr);
+ /* old_ptr = NULL */
+ my_struct_update(new_ptr, NULL);
+ . successfully update, now gptr=other_ptr
+ .
+ . successfully dispose
+ . now gptr=NULL again
+ .
+ cmpxchg(&gptr, NULL, new_ptr)
+ cmpxchg() success, but the update
+ and the dispose of CPU1 are missed
+ consideration by CPU0.
+ rcu_read_unlock();
+
+ In many use cases, these behaviors are OK. In these use cases,
+ my_struct_update(new_ptr, NULL) give us the same result even we retry.
+
+ But in some raw use cases(I can't find any use-case now, I believe it exist),
+ the missed considerations of the updates are not acceptable, in this case,
+ we should use different null-value for NULL pointer for every disposing.
+
+lock-free update site(dispose, wait-free, paranoid version):
+{
+ null_ptr = alloc_null_ptr();
+ old_ptr = xchg(&gptr, null_ptr);
+ if (is_null_ptr(old_ptr))
+ free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
+ else
+ kfree_rcu(old_ptr);
+}
+
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists