Message-Id: <1255468214.7113.2396.camel@gandalf.stny.rr.com>
Date: Tue, 13 Oct 2009 17:10:14 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH 1/5] [PATCH 1/5] function-graph/x86: replace unbalanced
ret with jmp
On Tue, 2009-10-13 at 16:47 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@...dmis.org) wrote:
> > From: Steven Rostedt <srostedt@...hat.com>
> >
> > The function graph tracer replaces the return address with a hook to
> > trace the exit of the function call. This hook will finish by returning
> > to the real location the function should return to.
> >
> > But the current implementation uses a ret to jump to the real return
> > location. This causes an imbalance between calls and rets. That is,
> > the original function does a call, the ret goes to the handler
> > and then the handler does a ret without a matching call.
> >
> > Although the function graph tracer itself still breaks the branch
> > predictor by replacing the original ret, by using a second ret and
> > causing an imbalance, it breaks the predictor even more.
> >
> > This patch replaces the ret with a jmp to keep the calls and ret
> > balanced. I tested this on one box and it showed a 1.7% increase in
> > performance. Another box only showed a small 0.3% increase. But no
> > box that I tested this on showed a decrease in performance by making this
> > change.
>
> This sounds exactly like what I proposed at LPC. I'm glad it shows
> actual improvements.

This is what we discussed at LPC. We both were under the assumption that
a jump would work. The question was how to make that jump without hosing
registers.

We lucked out that this is the back end of the return sequence, where we
can still clobber the registers a callee is allowed to clobber (just not
the ones holding the return value).

>
> Just to make sure I understand, the old sequence was:
>
> call fct
> call ftrace_entry
> ret to fct
> ret to ftrace_exit
> ret to caller
>
> and you now have:
>
> call fct
> call ftrace_entry
> ret to fct
> ret to ftrace_exit
> jmp to caller
>
> Am I correct ?

Almost.

What it was:

    call function

    function:
        call mcount

        mcount:
            call ftrace_entry

            ftrace_entry:
                redirect function's return address to ftrace_exit
                ret

            ret

        [function code]

        ret to ftrace_exit

    ftrace_exit:
        get real return
        ret to original

So for the function we have 3 calls and 4 rets.

Now we have:

    call function

    function:
        call mcount

        mcount:
            call ftrace_entry

            ftrace_entry:
                redirect function's return address to ftrace_exit
                ret

            ret

        [function code]

        ret to ftrace_exit

    ftrace_exit:
        get real return
        jmp to original

That gives 3 calls and 3 rets.

Note the first call still does not match the ret, but we don't do two
rets anymore.
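
To make the counting concrete, here is a toy model of a return-address
predictor, written as a rough sketch rather than a description of any real
CPU. It assumes the predictor is a plain stack: every call pushes its
return address, every ret pops one entry and counts a miss if the popped
entry differs from where the ret really goes, and a jmp leaves the stack
alone. The names (struct ras, do_call, do_ret, simulate) and the extra
OUTER frame above the caller are made up for the illustration.

```c
#include <assert.h>

#define RAS_DEPTH 16

/* Toy return-address stack (hypothetical model, not real hardware). */
struct ras {
    int entries[RAS_DEPTH];
    int top;
    int misses;
};

/* A call pushes the address its matching ret is expected to go to. */
static void do_call(struct ras *r, int return_addr)
{
    assert(r->top < RAS_DEPTH);
    r->entries[r->top++] = return_addr;
}

/* A ret pops one prediction; a mismatch (or empty stack) is a miss. */
static void do_ret(struct ras *r, int actual_target)
{
    int predicted = r->top > 0 ? r->entries[--r->top] : -1;

    if (predicted != actual_target)
        r->misses++;
}

/* Symbolic addresses for the places control can return to. */
enum { OUTER = 1, CALLER, IN_FUNCTION, IN_MCOUNT, FTRACE_EXIT };

/* Returns the number of mispredicted rets for one traced call. */
int simulate(int exit_uses_jmp)
{
    struct ras r = { .top = 0, .misses = 0 };

    do_call(&r, OUTER);        /* something calls the caller */
    do_call(&r, CALLER);       /* call function */
    do_call(&r, IN_FUNCTION);  /* call mcount */
    do_call(&r, IN_MCOUNT);    /* call ftrace_entry */
    do_ret(&r, IN_MCOUNT);     /* ftrace_entry: ret */
    do_ret(&r, IN_FUNCTION);   /* mcount: ret */
    do_ret(&r, FTRACE_EXIT);   /* hooked ret, prediction was CALLER */
    if (!exit_uses_jmp)
        do_ret(&r, CALLER);    /* old ftrace_exit: ret pops OUTER */
    /* new ftrace_exit: jmp to CALLER, predictor stack untouched */
    do_ret(&r, OUTER);         /* the caller itself returns */

    return r.misses;
}
```

In this model simulate(0), the old ret-based exit, counts three misses: the
hooked ret, the extra ret, and the caller's own ret, which finds the stack
already drained. simulate(1), the jmp-based exit, counts only the hooked
ret. Real predictors are more involved, so this shows the direction of the
effect and not its size.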
-- Steve