linux-kernel - Re: [ltt-dev] [RFC git tree] Userspace RCU (urcu) for Linux (repost)

Message-ID: <20090211085852.GA14973@Krystal>
Date:	Wed, 11 Feb 2009 03:58:52 -0500
From:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
To:	Lai Jiangshan <laijs@...fujitsu.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	ltt-dev@...ts.casi.polymtl.ca, linux-kernel@...r.kernel.org
Subject: Re: [ltt-dev] [RFC git tree] Userspace RCU (urcu) for Linux
	(repost)

* Lai Jiangshan (laijs@...fujitsu.com) wrote:
> Mathieu Desnoyers wrote:
> > 
> > I just did a mb() version of the urcu :
> > 
> > (uncomment CFLAGS+=-DDEBUG_FULL_MB in the Makefile)
> > 
> > Time per read : 48.4086 cycles
> > (about 6-7 times slower, as expected)
> > 
> 
> I have read many of Paul's papers
> (http://www.rdrop.com/users/paulmck/RCU/)
> and I know Paul worked hard to remove memory barriers from the
> RCU read side in the kernel. His work is significant.
> 
> But, I think,
> 1) Userspace RCU's read side can afford the latency of a
> memory barrier (including atomic operations).
>    Userspace does not access shared data as frequently as the kernel does,
> and the userspace read side is not as fast as the kernel's.
> 
> 2) Userspace uses RCU for its other strengths, not to save a few CPU cycles
>    (http://lwn.net/Articles/263130/)
>    One of the most important of these is being lock-free.
> 
> 
> If my thinking is right, the following suggestion may make sense too.
> 
> Use an all-system RCU for userspace RCU.
> 
> The all-system RCU is QRCU, which was implemented by Paul.
> http://lwn.net/Articles/223752/
> 
> Any system that has mechanisms equivalent to atomic_op,
> __wait_event, wake_up and mutex can also implement QRCU.
> So most systems can implement QRCU, which is why I call it an all-system RCU.
> 
> Obviously, we can implement a portable QRCU very simply on top of NPTL,
> and the read lock is:
> 	for (;;) {
> 		int idx = qp->completed & 0x1;
> 		if (likely(atomic_inc_not_zero(qp->ctr + idx)))
> 			return idx;
> 	}
> "atomic_inc_not_zero" will most likely be called only once, so it is fast enough.
> 
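
For reference, the read lock quoted above could be written in userspace
roughly as follows. This is only a sketch using C11 atomics, not code from
the urcu tree or from the kernel: `struct qrcu`, the CAS-based
`atomic_inc_not_zero()` and the simplified `qrcu_read_unlock()` (which omits
waking up a waiting writer) are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's QRCU structure. */
struct qrcu {
	atomic_int completed;	/* low bit selects the active counter */
	atomic_int ctr[2];	/* reader counts; the active one is biased by 1 */
};

/* Increment *v unless it is zero; return true if the increment happened. */
static bool atomic_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Same loop as in the quoted proposal: retry until a counter is taken. */
static int qrcu_read_lock(struct qrcu *qp)
{
	for (;;) {
		int idx = atomic_load(&qp->completed) & 0x1;

		if (atomic_inc_not_zero(&qp->ctr[idx]))
			return idx;
	}
}

/* Simplified: a full version would also wake a writer waiting for zero. */
static void qrcu_read_unlock(struct qrcu *qp, int idx)
{
	atomic_fetch_sub(&qp->ctr[idx], 1);
}
```

Note that even the uncontended path performs one atomic read-modify-write
per read lock, which is the cost being discussed in this thread.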

Hi Lai,

There are a few reasons why we need RCU in userspace for tracing:

- We need very fast per-cpu read-side synchronization for data structure
  handling. Updates are rare (enabling/disabling tracing). Therefore,
  your argument about userspace not needing "fast" rcu does not hold in
  this case. Note that LTTng has the performance it has today in the
  kernel because I made sure to use no memory barriers when unnecessary
  and because I used the minimal amount of atomic operations required.
  Those represent costly synchronization primitives on quite a few
  architectures.
- Being lock-free (atomic). To trace code executed in signal handlers,
  we need to be able to nest over any user code. With the solution you
  propose above, the busy-loop in the read lock does not seem to be
  signal-safe: if it nests over a writer, it could busy-loop forever.
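
To make the signal-safety concern concrete, here is a minimal sketch (C11
atomics) of a bounded variant of that read lock. It assumes a hypothetical
state in which the counter selected by `completed` is zero and only the
interrupted thread could change that; whether a real QRCU writer can leave
the counters in that state is not established here. `struct qrcu_like`,
`inc_not_zero()` and `read_lock_bounded()` are illustrative names, not an
existing API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter pair, modelled on the quoted read lock. */
struct qrcu_like {
	atomic_int completed;
	atomic_int ctr[2];
};

/* Increment *v unless it is zero; return true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Bounded form of the retry loop. Returns the index on success, or -1
 * after max_tries failed attempts. The unbounded loop would never return
 * in a signal handler if the state it waits on can only be changed by
 * the code it interrupted.
 */
static int read_lock_bounded(struct qrcu_like *qp, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		int idx = atomic_load(&qp->completed) & 0x1;

		if (inc_not_zero(&qp->ctr[idx]))
			return idx;
	}
	return -1;
}
```

With the selected counter stuck at zero, `read_lock_bounded()` gives up and
returns -1, where the original `for (;;)` loop would keep spinning. A tracer
would then have to drop the event, which is why a read side that never has
to retry is preferable here.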

Mathieu

> Lai.
> 
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68