linux-kernel - [PATCH] rcu,doc: lock-free update site

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4DF7231B.1080708@cn.fujitsu.com>
Date:	Tue, 14 Jun 2011 17:00:11 +0800
From:	Lai Jiangshan <laijs@...fujitsu.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	josh@...htriplett.org, Manfred Spraul <manfred@...orfullife.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: [PATCH] rcu,doc: lock-free update site

Add a document which describes a pattern of using RCU to implement lock-free(lockless)
update site.

Singed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
---
 Documentation/RCU/00-INDEX                  |    2 +
 Documentation/RCU/lock-free-update-site.txt |  143 +++++++++++++++++++++++++++
 2 files changed, 145 insertions(+), 0 deletions(-)

diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 1d7a885..7178dd5 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -8,6 +8,8 @@ listRCU.txt
 	- Using RCU to Protect Read-Mostly Linked Lists
 lockdep.txt
 	- RCU and lockdep checking
+lock-free-update-site.txt
+	- RCU pattern of lock-free update site
 NMI-RCU.txt
 	- Using RCU to Protect Dynamic NMI Handlers
 rcubarrier.txt
diff --git a/Documentation/RCU/lock-free-update-site.txt b/Documentation/RCU/lock-free-update-site.txt
new file mode 100644
index 0000000..9b6984a
--- /dev/null
+++ b/Documentation/RCU/lock-free-update-site.txt
@@ -0,0 +1,143 @@
+Lock-free(lockless) update site
+
+This article describes a pattern of using RCU to implement lock-free(lockless)
+update site. RCU update site is considered call-rare and it is protected
+by a update-site lock generally. But blocking algorithms are undesirable
+in some cases for some reasons, thus, this pattern may help.
+
+This pattern can only protect a single pointer which is the only reference
+of the object.
+
+object pointer:
+
+struct my_struct *gptr;
+
+wait-free read site:
+{
+	rcu_read_lock();
+	ptr = rcu_dereference(gptr);
+	my_struct_read(ptr);
+	rcu_read_unlock();
+}
+
+lock-free update site(update as new):
+{
+	new_ptr = my_struct_alloc();
+	for (;;) {
+		rcu_read_lock();
+
+		old_ptr = rcu_dereference(gptr);
+
+		/* copy data from old_ptr to new_ptr and update it */
+		my_struct_update(new_ptr, old_ptr);
+
+		/* atomically publish the new_ptr and de-publish the old_ptr */
+		if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
+			rcu_read_unlock();
+
+			/*
+			 * free it after a grace-period, read sites and other
+			 * update sites may be reading it in parallel.
+			 */
+			kfree_rcu(old_ptr);
+
+			/* success, exit the loop */
+			break;
+		} else {
+			rcu_read_unlock();
+
+			/*
+			 * Other update site successfully update it, we need
+			 * to read the latest data and try the update again.
+			 *
+			 * If the other update site did the same thing we need,
+			 * we can free the new_ptr and exit this loop too,
+			 * and it may becomes a wait-free algorithm.
+			 */
+		}
+	}
+}
+
+1) In update site, rcu_read_lock() is needed for my_struct_update().
+
+   In this kind of lock-free update site, many update sides
+   may run parallel, other update side may had successfully
+   de-published old_ptr and tried to free it. rcu_read_lock()
+   prevents old_ptr from freeing and ensures it valid for
+   my_struct_update().
+
+2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
+
+   Although the content of old_ptr is not accessed when cmpxchg(),
+   but old_ptr should not be freed until cmpxchg() finished.
+   Otherwise we may miss other successful update and publish a
+   new_ptr without information from the latest object.
+
+   Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
+   (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   my_struct_update(new_ptr, old_ptr);
+   rcu_read_unlock();
+   .						successfully update, now gptr=other_ptr
+   .						old_ptr is freed
+   .
+   .						other update, my_struct_alloc() returns old_ptr
+   .						successfully publish and de-publish
+   .						now gptr=old_ptr again
+   .
+   cmpxchg(&gptr, old_ptr, new_ptr)
+     cmpxchg() success, but the 2 updates
+     of CPU1 are completely missed.
+
+   This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
+   before cmpxchg() finished and to prevent ABA-problem.
+
+3) Beware NULL pointer.
+
+   Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
+
+lock-free update site(dispose, wait-free):
+{
+	old_ptr = xchg(&gptr, NULL);
+	if (old_ptr != NULL)
+		kfree_rcu(old_ptr);
+}
+
+   This code cause NULL reusing and may cause ABA-problem like above example:
+
+   CPU0						CPU1
+   rcu_read_lock()
+   old_ptr = rcu_dereference(gptr);
+   /* old_ptr = NULL */
+   my_struct_update(new_ptr, NULL);
+   .						successfully update, now gptr=other_ptr
+   .
+   .						successfully dispose
+   .						now gptr=NULL again
+   .
+   cmpxchg(&gptr, NULL, new_ptr)
+     cmpxchg() success, but the update
+     and the dispose of CPU1 are missed
+     consideration by CPU0.
+   rcu_read_unlock();
+
+   In many use cases, these behaviors are OK. In these use cases,
+   my_struct_update(new_ptr, NULL) give us the same result even we retry.
+
+   But in some raw use cases(I can't find any use-case now, I believe it exist),
+   the missed considerations of the updates are not acceptable, in this case,
+   we should use different null-value for NULL pointer for every disposing.
+
+lock-free update site(dispose, wait-free, paranoid version):
+{
+	null_ptr = alloc_null_ptr();
+	old_ptr = xchg(&gptr, null_ptr);
+	if (is_null_ptr(old_ptr))
+		free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
+	else
+		kfree_rcu(old_ptr);
+}
+
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/