Message-ID: <20100310041658.GA11667@Krystal>
Date: Tue, 9 Mar 2010 23:16:58 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Nick Piggin <npiggin@...e.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
josh@...htriplett.org, dvhltc@...ibm.com, niv@...ibm.com,
tglx@...utronix.de, peterz@...radead.org, Valdis.Kletnieks@...edu,
dhowells@...hat.com, linux-kernel@...r.kernel.org,
Chris Friesen <cfriesen@...tel.com>,
Frédéric Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH -tip] introduce sys_membarrier(): process-wide memory
barrier (v9)
* Nick Piggin (npiggin@...e.de) wrote:
[...]
> The library is librcu, which I suspect will become quite important for
> parallel programming in future (maybe I hope for too much).
>
> But maybe it's better to not merge _any_ librcu special case until
> we see results from programs using it. More general speedups or features
> (that also help librcu) is a different story.
>
Hi Nick,
So, about the current state of liburcu and its users:
It is currently packaged in Debian, Ubuntu, and Gentoo, and it is also being
packaged for Fedora. It is already used by a few programs and libraries, and
given its wide availability, we can expect more in the near future.
The first user of this library is the UST (Userspace Tracing) library, a port of
LTTng to userspace.
http://lttng.org/ust
Modulo a few changes to port it to userspace, the kernel and user-space LTTng
should be expected to have similar performance, because they use essentially the
same lockless buffering scheme, described in chapter 5 of my thesis:
http://www.lttng.org/pub/thesis/desnoyers-dissertation-2009-12.pdf
Here is the impact of two additional memory barriers on the LTTng tracer fast
path:
Intel Core Xeon 2.0 GHz
LTTng probe writing 16 bytes of data to the trace (+ 4-byte event header)
(execution of 200000 loops, therefore trace buffers are cache-hot)
119 ns per event
adding 2 memory barriers, one before and one after the tracepoint:
155 ns per event
So we have a slowdown of about 30% on the tracer fast path (36 ns added to
119 ns), which is quite significant when it comes to trace-heavy workloads. The
slowdown ratio may change slightly for non-cache-hot scenarios, but I expect it
to stay in the same range. Section 8.4 of my thesis discusses the overhead of
cache-cold buffers (around 333 ns per event rather than 119 ns). I expect the
cost of the memory barriers to increase too in a cache-cold scenario.
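The overhead arithmetic for the two cache-hot measurements (119 ns and 155 ns
per event) can be spelled out as a small sketch. The helper names below are
hypothetical, and the slowdown is computed relative to the barrier-free
baseline:

```c
#include <assert.h>

/* Extra time per event added by the barriers, in nanoseconds. */
double added_ns(double base_ns, double with_barriers_ns)
{
	return with_barriers_ns - base_ns;
}

/* Average cost attributed to each barrier, in nanoseconds. */
double per_barrier_ns(double base_ns, double with_barriers_ns, int nr_barriers)
{
	assert(nr_barriers > 0);
	return added_ns(base_ns, with_barriers_ns) / nr_barriers;
}

/* Slowdown relative to the barrier-free fast path, in percent. */
double slowdown_pct(double base_ns, double with_barriers_ns)
{
	assert(base_ns > 0.0);
	return 100.0 * added_ns(base_ns, with_barriers_ns) / base_ns;
}
```

With base_ns = 119 and with_barriers_ns = 155, this gives 36 ns added per
event, 18 ns per barrier on average, and a slowdown of roughly 30%.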
For some insight into the class of applications that can be improved with the
userspace RCU library, you can have a look at Section 6.3 "User-Space RCU Usage
Scenarios" of my dissertation.
If you still wonder "who is using/contributing to LTTng?", see Section 9.2 of
my thesis. Or here is a quick list, taken from our website:
Google, IBM, Ericsson, Autodesk, Wind River, Fujitsu, Monta Vista, ST
Microelectronics, C2 Microsystems, Sony, Siemens, Nokia, Defence Research and
Development Canada.
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency Consultant
EfficiOS Inc.
http://www.efficios.com