Message-ID: <1459831974-2891931-2-git-send-email-ast@fb.com>
Date: Mon, 4 Apr 2016 21:52:47 -0700
From: Alexei Starovoitov <ast@...com>
To: Steven Rostedt <rostedt@...dmis.org>
CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"David S . Miller" <davem@...emloft.net>,
Ingo Molnar <mingo@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Wang Nan <wangnan0@...wei.com>, Josef Bacik <jbacik@...com>,
Brendan Gregg <brendan.d.gregg@...il.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<kernel-team@...com>
Subject: [PATCH net-next 1/8] perf: optimize perf_fetch_caller_regs
Avoid the memset in perf_fetch_caller_regs, since it is on the critical path of all tracepoints.
It is called from perf_sw_event_sched, perf_event_task_sched_in and all of perf_trace_##call
with this_cpu_ptr(&__perf_regs[..]), which is zero-initialized by percpu_alloc, and the
subsequent call to perf_arch_fetch_caller_regs initializes the same fields on all archs,
so we can safely drop the memset from all of the above cases and move it into
perf_ftrace_function_call, which calls it with a stack-allocated pt_regs.
Signed-off-by: Alexei Starovoitov <ast@...nel.org>
---
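The reasoning in the changelog can be illustrated with a small userspace sketch.
It is not kernel code: struct fake_regs, fake_arch_fetch, fake_fetch, static_case
and stack_case are hypothetical stand-ins, and the set of fields written by the
arch hook is an assumption made only for illustration. The point it tries to show
is that a zero-initialized buffer whose writer always touches the same fields
never needs re-clearing, while a stack buffer does need one explicit memset.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs; the real layout is per-arch. */
struct fake_regs {
	unsigned long ip;
	unsigned long sp;
	unsigned long other;	/* a field the arch hook never writes */
};

/* Models perf_arch_fetch_caller_regs: always writes the same fields. */
static void fake_arch_fetch(struct fake_regs *regs,
			    unsigned long ip, unsigned long sp)
{
	regs->ip = ip;
	regs->sp = sp;
}

/* Models the patched perf_fetch_caller_regs: no memset here. */
static void fake_fetch(struct fake_regs *regs,
		       unsigned long ip, unsigned long sp)
{
	fake_arch_fetch(regs, ip, sp);
}

/* Static storage starts out zeroed, standing in for the per-cpu buffer. */
static struct fake_regs static_regs;

/* Repeated use of the zeroed buffer: untouched fields stay zero. */
unsigned long static_case(void)
{
	fake_fetch(&static_regs, 0x1000, 0x2000);
	fake_fetch(&static_regs, 0x3000, 0x4000);
	return static_regs.other;
}

/* Stack buffer: contents are indeterminate, so the caller clears it once. */
unsigned long stack_case(void)
{
	struct fake_regs regs;

	memset(&regs, 0, sizeof(regs));
	fake_fetch(&regs, 0x5000, 0x6000);
	return regs.other;
}
```

In this model static_case() corresponds to the tracepoint callers that pass the
per-cpu buffer, and stack_case() corresponds to perf_ftrace_function_call, which
is why the memset moves there rather than disappearing entirely.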
include/linux/perf_event.h | 2 --
kernel/trace/trace_event_perf.c | 1 +
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index f291275ffd71..e89f7199c223 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -882,8 +882,6 @@ static inline void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned lo
*/
static inline void perf_fetch_caller_regs(struct pt_regs *regs)
{
- memset(regs, 0, sizeof(*regs));
-
perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
}
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 00df25fd86ef..7a68afca8249 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -316,6 +316,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
BUILD_BUG_ON(ENTRY_SIZE > PERF_MAX_TRACE_SIZE);
+ memset(&regs, 0, sizeof(regs));
perf_fetch_caller_regs(&regs);
entry = perf_trace_buf_prepare(ENTRY_SIZE, TRACE_FN, NULL, &rctx);
--
2.8.0