Message-ID: <20100112185641.GA26731@Krystal>
Date: Tue, 12 Jan 2010 13:56:41 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
akpm@...ux-foundation.org, josh@...htriplett.org,
tglx@...utronix.de, Valdis.Kletnieks@...edu, dhowells@...hat.com,
laijs@...fujitsu.com, dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory
barrier (v3b)
* Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> On Tue, Jan 12, 2010 at 10:38:54AM -0500, Mathieu Desnoyers wrote:
> > * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > > On Sun, Jan 10, 2010 at 11:30:16PM -0500, Mathieu Desnoyers wrote:
> > > > Here is an implementation of a new system call, sys_membarrier(), which
> > > > executes a memory barrier on all threads of the current process.
> > > >
> > > > It aims at greatly simplifying and enhancing the current signal-based
> > > > liburcu userspace RCU synchronize_rcu() implementation.
> > > > (found at http://lttng.org/urcu)
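
(For reference, liburcu's synchronize_rcu() slow path would use the new
system call roughly as follows. Simplified sketch only: the function name
and the syscall number below are placeholders, and the fallback path for
older kernels is elided.)

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Placeholder number; the real one gets assigned if/when this is merged. */
#ifndef __NR_membarrier
#define __NR_membarrier 338
#endif

/*
 * Promote the compiler barrier on the urcu read-side fast path to a
 * memory barrier on every running thread of the current process.
 */
static void force_mb_all_threads(void)
{
	if (syscall(__NR_membarrier) < 0) {
		/*
		 * Kernel without sys_membarrier: fall back to the existing
		 * signal-based scheme (not shown here).
		 */
	}
}
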
> > >
> > > I didn't expect quite this comprehensive of an implementation from the
> > > outset, but I guess I cannot complain. ;-)
> > >
> > > Overall, good stuff.
> > >
> > > Interestingly enough, what you have implemented is analogous to
> > > synchronize_rcu_expedited() and friends that have recently been added
> > > to the in-kernel RCU API. By this analogy, my earlier semi-suggestion
> > > of synchronize_rcu() would be a candidate non-expedited implementation.
> > > Long latency, but extremely low CPU consumption, full batching of
> > > concurrent requests (even unrelated ones), and so on.
> >
> > Yes, the main difference, I think, is that the sys_membarrier
> > infrastructure focuses on IPI-ing only the current process's running
> > threads.
>
> Which does indeed make sense for the expedited interface. On the other
> hand, if you have a bunch of concurrent non-expedited requests from
> different processes, covering all CPUs efficiently satisfies all of
> the requests in one go. And, if you use synchronize_sched() for the
> non-expedited case, there will be no IPIs in the common case.
So are you proposing we add an "int expedited" parameter to the
system call, and let the caller choose between the IPI and
synchronize_sched() schemes?
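
E.g. something like the following (rough sketch only; the parameter name is
a placeholder, and the real expedited path would also double-check which
CPUs actually run this mm before sending IPIs):

/* Would live in kernel/sched.c or similar. */
#include <linux/syscalls.h>
#include <linux/sched.h>
#include <linux/smp.h>

static void membarrier_ipi(void *unused)
{
	smp_mb();	/* order memory accesses around the interrupted code */
}

SYSCALL_DEFINE1(membarrier, int, expedited)
{
	if (thread_group_empty(current))
		return 0;		/* single-threaded: nothing to order */

	if (!expedited) {
		/* High latency, but batches with concurrent callers, no IPIs. */
		synchronize_sched();
		return 0;
	}

	smp_mb();			/* order accesses before the IPIs */
	preempt_disable();
	smp_call_function_many(mm_cpumask(current->mm), membarrier_ipi,
			       NULL, 1);
	preempt_enable();
	smp_mb();			/* order accesses after the IPIs */
	return 0;
}
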
[...]
> > > Also, is "top"
> > > accurate given that the IPI handler will have interrupts disabled?
> >
> > Probably not. AFAIK, "top" does not really factor interrupts into its
> > accounting. So, better to take this top output with a grain of salt or two.
>
> Might need something like oprofile to get good info?
Could be, although I just wanted to point out the kind of pattern we
should expect. I'm not convinced it would be very useful to give the
detailed oprofile info. I'm rephrasing the above paragraph to state that
top is not super-accurate here.
[...]
Thanks,
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68