Message-ID: <20100113002344.GH13035@linux.vnet.ibm.com>
Date: Tue, 12 Jan 2010 16:23:44 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
akpm@...ux-foundation.org, josh@...htriplett.org,
tglx@...utronix.de, Valdis.Kletnieks@...edu, dhowells@...hat.com,
laijs@...fujitsu.com, dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v3b)
On Tue, Jan 12, 2010 at 01:56:41PM -0500, Mathieu Desnoyers wrote:
> * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > On Tue, Jan 12, 2010 at 10:38:54AM -0500, Mathieu Desnoyers wrote:
> > > * Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> > > > On Sun, Jan 10, 2010 at 11:30:16PM -0500, Mathieu Desnoyers wrote:
> > > > > Here is an implementation of a new system call, sys_membarrier(), which
> > > > > executes a memory barrier on all threads of the current process.
> > > > >
> > > > > It aims at greatly simplifying and enhancing the current signal-based
> > > > > liburcu userspace RCU synchronize_rcu() implementation.
> > > > > (found at http://lttng.org/urcu)
> > > >
> > > > I didn't expect quite this comprehensive of an implementation from the
> > > > outset, but I guess I cannot complain. ;-)
> > > >
> > > > Overall, good stuff.
> > > >
> > > > Interestingly enough, what you have implemented is analogous to
> > > > synchronize_rcu_expedited() and friends that have recently been added
> > > > to the in-kernel RCU API. By this analogy, my earlier semi-suggestion
> > > > > of synchronize_rcu() would be a candidate non-expedited implementation.
> > > > Long latency, but extremely low CPU consumption, full batching of
> > > > concurrent requests (even unrelated ones), and so on.
> > >
> > > > Yes, the main difference I think is that the sys_membarrier
> > > > infrastructure focuses on IPI-ing only the current process's running
> > > > threads.
> >
> > Which does indeed make sense for the expedited interface. On the other
> > hand, if you have a bunch of concurrent non-expedited requests from
> > different processes, covering all CPUs efficiently satisfies all of
> > the requests in one go. And, if you use synchronize_sched() for the
> > non-expedited case, there will be no IPIs in the common case.
>
> So are you proposing we add an "int expedited" parameter to the
> system call, and let the caller choose between the IPI and
> synchronize_sched() schemes?

Sure, why not?
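
For illustration only, the dispatch being discussed could be sketched in
userspace C roughly as follows. This is a minimal sketch, not the patch:
membarrier_sketch(), ipi_running_threads(), wait_for_grace_period() and
last_path are hypothetical names, and the two helpers are stubs that only
record which path was taken. In the kernel, the expedited path would IPI
the CPUs running the current process's threads, and the non-expedited
path would block in synchronize_sched().

```c
/* Which scheme the last call selected; exists only so the sketch is observable. */
enum barrier_path { PATH_NONE, PATH_IPI, PATH_SYNC_SCHED };

static enum barrier_path last_path = PATH_NONE;

/* Hypothetical stub: stands in for IPI-ing the CPUs that run our threads. */
static void ipi_running_threads(void)
{
	last_path = PATH_IPI;
}

/* Hypothetical stub: stands in for blocking in synchronize_sched(). */
static void wait_for_grace_period(void)
{
	last_path = PATH_SYNC_SCHED;
}

/*
 * Assumed shape of the call: nonzero "expedited" picks the low-latency
 * IPI scheme, zero picks the long-latency, low-overhead scheme that can
 * batch concurrent requests from unrelated processes.
 */
static int membarrier_sketch(int expedited)
{
	if (expedited)
		ipi_running_threads();
	else
		wait_for_grace_period();
	return 0;
}
```

A caller that needs low latency would pass 1, while a caller that can
tolerate a long wait in exchange for near-zero CPU cost would pass 0.
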
> [...]
> > > > Also, is "top"
> > > > accurate given that the IPI handler will have interrupts disabled?
> > >
> > > Probably not. AFAIK, "top" does not really factor interrupts into its
> > > accounting. So, better take this top output with a grain of salt or two.
> >
> > Might need something like oprofile to get good info?
>
> Could be. Although I just wanted to point out the kind of pattern we
> should expect. I'm not convinced it's so useful to give the detailed
> oprofile info. I'm rephrasing the above paragraph to state that top is
> not super-accurate here.
K.
Thanx, Paul
> [...]
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68