[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20180501152135.GJ26088@linux.vnet.ibm.com>
Date: Tue, 1 May 2018 08:21:35 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Joel Fernandes <joelaf@...gle.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Tom Zanussi <tom.zanussi@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Boqun Feng <boqun.feng@...il.com>,
"Cc: Frederic Weisbecker" <fweisbec@...il.com>,
Randy Dunlap <rdunlap@...radead.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Fenguang Wu <fengguang.wu@...el.com>,
Baohong Liu <baohong.liu@...el.com>,
Vedang Patel <vedang.patel@...el.com>,
"Cc: Android Kernel" <kernel-team@...roid.com>
Subject: Re: [PATCH RFC v5 5/6] tracepoint: Make rcuidle tracepoint callers
use SRCU
On Tue, May 01, 2018 at 03:16:02PM +0000, Joel Fernandes wrote:
> On Tue, May 1, 2018 at 7:34 AM Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> wrote:
>
> > On Tue, May 01, 2018 at 10:24:01AM -0400, Steven Rostedt wrote:
> > > On Mon, 30 Apr 2018 18:42:03 -0700
> > > Joel Fernandes <joelaf@...gle.com> wrote:
> > >
> > > > In recent tests with IRQ on/off tracepoints, a large performance
> > > > overhead ~10% is noticed when running hackbench. This is root caused
> to
> > > > calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the
> > > > tracepoint code. Following a long discussion on the list [1] about
> this,
> > > > we concluded that srcu is a better alternative for use during rcu
> idle.
> > > > Although it does involve extra barriers, its lighter than the
> sched-rcu
> > > > version which has to do additional RCU calls to notify RCU idle about
> > > > entry into RCU sections.
> > > >
> > > > In this patch, we change the underlying implementation of the
> > > > trace_*_rcuidle API to use SRCU. This has shown to improve performance
> > > > alot for the high frequency irq enable/disable tracepoints.
>
> > [ . . . ]
>
> > > > --- a/kernel/tracepoint.c
> > > > +++ b/kernel/tracepoint.c
> > > > @@ -31,6 +31,9 @@
> > > > extern struct tracepoint * const __start___tracepoints_ptrs[];
> > > > extern struct tracepoint * const __stop___tracepoints_ptrs[];
> > > >
> > > > +DEFINE_SRCU(tracepoint_srcu);
> > > > +EXPORT_SYMBOL_GPL(tracepoint_srcu);
> > > > +
> > > > /* Set to 1 to enable tracepoint debug output */
> > > > static const int tracepoint_debug;
> > > >
> > > > @@ -67,11 +70,16 @@ static inline void *allocate_probes(int count)
> > > > return p == NULL ? NULL : p->probes;
> > > > }
> > > >
> > > > -static void rcu_free_old_probes(struct rcu_head *head)
> > > > +static void srcu_free_old_probes(struct rcu_head *head)
> > > > {
> > > > kfree(container_of(head, struct tp_probes, rcu));
> > > > }
> > > >
> > > > +static void rcu_free_old_probes(struct rcu_head *head)
> > > > +{
> > > > + call_srcu(&tracepoint_srcu, head, srcu_free_old_probes);
> > >
> > > Hmm, is it OK to call call_srcu() from a call_rcu() callback? I guess
> > > it would be.
>
> > It is perfectly legal, and quite a bit simpler than setting something
> > up to wait for both to complete concurrently.
>
> Cool. Also in this case if we call both in sequence, then I felt there
> could be a race to free the old data since both callbacks would try to do
> the same thing. The same thing being freeing of the same set of old probes
> which would need some synchronization between the 2 callbacks. With the
> chaining, since the ordering is assured there wouldn't be a question of
> such a race. I could add this reasoning to the changelog as well.
Actually, as long as you have a solid happens-before between both of the
callbacks and the freeing, you are in good shape. A release-acquire would
work fine, as would a lock acquired in both callbacks and then acquired
(and possibly released) before the free.
Thanx, Paul
Powered by blists - more mailing lists