[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.58.0801300856330.29020@gandalf.stny.rr.com>
Date: Wed, 30 Jan 2008 09:09:12 -0500 (EST)
From: Steven Rostedt <rostedt@...dmis.org>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
cc: LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Hellwig <hch@...radead.org>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Gregory Haskins <ghaskins@...ell.com>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
Thomas Gleixner <tglx@...utronix.de>,
Tim Bird <tim.bird@...sony.com>,
Sam Ravnborg <sam@...nborg.org>,
"Frank Ch. Eigler" <fche@...hat.com>,
Jan Kiszka <jan.kiszka@...mens.com>,
John Stultz <johnstul@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arjan van de Ven <arjan@...radead.org>,
Steven Rostedt <srostedt@...hat.com>
Subject: Re: [PATCH 02/22 -v7] Add basic support for gcc profiler instrumentation
Paul,
Peter and I are having a discussion on craziness of archs and memory
barriers. You seem to understand crazy archs pretty well, and we would
like some advice. :-)
See below:
On Wed, 30 Jan 2008, Steven Rostedt wrote:
>
>
> On Wed, 30 Jan 2008, Peter Zijlstra wrote:
>
> >
> > On Tue, 2008-01-29 at 22:15 -0500, Steven Rostedt wrote:
> >
> > > +int register_mcount_function(struct mcount_ops *ops)
> > > +{
> > > + unsigned long flags;
> > > +
> > > + spin_lock_irqsave(&mcount_func_lock, flags);
> > > + ops->next = mcount_list;
> > > + /* must have next seen before we update the list pointer */
> > > + smp_wmb();
> >
> > That comment does not explain which race it closes; this is esp
> > important as there is no paired barrier to give hints.
>
> OK, fair enough. I'll explain it a bit more.
>
> How's this:
>
> /*
> * We are entering ops into the mcount_list but another
> * CPU might be walking that list. We need to make sure
> * the ops->next pointer is valid before another CPU sees
> * the ops pointer included into the mcount_list.
> */
>
The above is my new comment. But Peter says that it's still not good
enough and that all write memory barriers need read barriers. Let me
explain the situation here.
We have a single link list called mcount_list that is walked when more
than one function is registered by mcount. Mcount is called at the start
of all C functions that are not annotated with "notrace". When more than
one function is registered, mcount calls a loop function that does the
following:
notrace void mcount_list_func(unsigned long ip, unsigned long parent_ip)
{
struct mcount_ops *op = mcount_list;
while (op != &mcount_list_end) {
op->func(ip, parent_ip);
op = op->next;
};
}
A registered function must already have a "func" filled, and the mcount
register code takes care of "next". It is documented that the calling
function should "never" change next and always expect that the func can be
called after it is unregistered. That's not the issue here.
The issue is how to insert the ops into the list. I've done the following,
as you can see in the code this text is inserted between.
ops->next = mcount_list;
smp_wmb();
mcount_list = ops;
The read side pair is the reading of ops to ops->next, which should imply
a smp_rmb() just by the logic. But Peter tells me things like alpha is
crazy enough to do better than that! Thus, I'm asking you.
Can some arch have a reader where it receives ops->next before it received
ops? This seems to me to be a phsyic arch, to know where ops->next is
before it knows ops!
Remember, that the ops that is being registered, is not viewable by any
other CPU until mcount_list = ops. I don't see the need for a read barrier
in this case. But I could very well be wrong.
Help!
-- Steve
>
> >
> > > + mcount_list = ops;
> > > + /*
> > > + * For one func, simply call it directly.
> > > + * For more than one func, call the chain.
> > > + */
> > > + if (ops->next == &mcount_list_end)
> > > + mcount_trace_function = ops->func;
> > > + else
> > > + mcount_trace_function = mcount_list_func;
> > > + spin_unlock_irqrestore(&mcount_func_lock, flags);
> > > +
> > > + return 0;
> > > +}
> >
> >
> >
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists