lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141217143105.GG18257@krava.brq.redhat.com>
Date:	Wed, 17 Dec 2014 15:31:05 +0100
From:	Jiri Olsa <jolsa@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	mingo@...nel.org, linux-kernel@...r.kernel.org,
	Frederic Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] perf: Avoid horrible stack usage

On Tue, Dec 16, 2014 at 12:50:41PM +0100, Peter Zijlstra wrote:
> Both Linus (most recent) and Steve (a while ago) reported that perf
> results in massive stack bloat.
> 
> The problem is that software events need a pt_regs in order to
> properly report the event location and unwind stack. And because we
> could not assume one was present we allocated one on stack and filled
> it with minimal bits required for operation.
> 
> Now, pt_regs is quite large, so this is undesirable. Furthermore it
> turns out that most sites actually have a pt_regs pointer available,
> making this even more onerous, as the stack space is pointless waste.
> 
> This patch addresses the problem by observing that software events
> have well defined nesting semantics, therefore we can use static
> per-cpu storage instead of on-stack.
> 
> Linus made the further observation that all but the scheduler callers
> of perf_sw_event() have a pt_regs available, so we change the regular
> perf_sw_event() to require a valid pt_regs (where it used to be
> optional) and add perf_sw_event_sched() for the scheduler.
> 
> We have a scheduler specific call instead of a more generic _noregs()
> like construct because we can assume non-recursion from the scheduler
> and thereby simplify the code further (_noregs would have to put the
> recursion context call inline in order to assertain which __perf_regs
> element to use).
> 
> One last note on the implementation of perf_trace_buf_prepare(); we
> allow .regs = NULL for those cases where we already have a pt_regs
> pointer available and do not need another.

looks like we could also sanitize the perf_ftrace_function_call
in the same way.. like below (untested)

hum, I just noticed it actually has pt_regs arg ;-)

Steven, the commit log of:
  a1e2e31d175a ftrace: Return pt_regs to function trace callback

says the pt_regs value is arch specific, but I could not find the
ARCH_SUPPORTS_FTRACE_SAVE_REGS define..

thanks,
jirka


---
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 6fa484de2ba1..6122c216e857 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -304,7 +304,7 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
 {
 	struct ftrace_entry *entry;
 	struct hlist_head *head;
-	struct pt_regs regs;
+	struct pt_regs *regs;
 	int rctx;
 
 	head = this_cpu_ptr(event_function.perf_events);
@@ -316,12 +316,12 @@ perf_ftrace_function_call(unsigned long ip, unsigned long parent_ip,
 
 	BUILD_BUG_ON(ENTRY_SIZE > PERF_MAX_TRACE_SIZE);
 
-	perf_fetch_caller_regs(&regs);
-
-	entry = perf_trace_buf_prepare(ENTRY_SIZE, TRACE_FN, NULL, &rctx);
+	entry = perf_trace_buf_prepare(ENTRY_SIZE, TRACE_FN, &regs, &rctx);
 	if (!entry)
 		return;
 
+	perf_fetch_caller_regs(&regs);
+
 	entry->ip = ip;
 	entry->parent_ip = parent_ip;
 	perf_trace_buf_submit(entry, ENTRY_SIZE, rctx, 0,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ