lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 16 Mar 2017 09:35:17 -0700
From:   tip-bot for Stephane Eranian <tipbot@...or.com>
To:     linux-tip-commits@...r.kernel.org
Cc:     hpa@...or.com, luto@...nel.org, mingo@...nel.org, jolsa@...hat.com,
        linux-kernel@...r.kernel.org, namhyung@...nel.org,
        tglx@...utronix.de, peterz@...radead.org, acme@...hat.com,
        eranian@...gle.com
Subject: [tip:perf/core] perf tools: Make
 perf_event__synthesize_mmap_events() scale

Commit-ID:  88b897a30c525c2eee6e7f16e1e8d0f18830845e
Gitweb:     http://git.kernel.org/tip/88b897a30c525c2eee6e7f16e1e8d0f18830845e
Author:     Stephane Eranian <eranian@...gle.com>
AuthorDate: Wed, 15 Mar 2017 10:17:13 -0700
Committer:  Arnaldo Carvalho de Melo <acme@...hat.com>
CommitDate: Wed, 15 Mar 2017 17:48:37 -0300

perf tools: Make perf_event__synthesize_mmap_events() scale

This patch significantly improves the execution time of
perf_event__synthesize_mmap_events() when running perf record on systems
where processes have lots of threads.

It just happens that cat /proc/pid/maps support uses a O(N^2) algorithm to
generate each map line in the maps file.  If you have 1000 threads, then you
have necessarily 1000 stacks.  For each vma, you need to check if it
corresponds to a thread's stack.  With a large number of threads, this can take
a very long time. I have seen latencies >> 10mn.

As of today, perf does not use the fact that a mapping is a stack, therefore we
can work around the issue by using /proc/pid/tasks/pid/maps.  This entry does
not try to map a vma to stack and is thus much faster with no loss of
functonality.

The proc-map-timeout logic is kept in case users still want some upper limit.

In V2, we fix the file path from /proc/pid/tasks/pid/maps to actual
/proc/pid/task/pid/maps, tasks -> task.  Thanks Arnaldo for catching this.

Committer note:

This problem seems to have been elliminated in the kernel since commit :
b18cb64ead40 ("fs/proc: Stop trying to report thread stacks").

Signed-off-by: Stephane Eranian <eranian@...gle.com>
Acked-by: Jiri Olsa <jolsa@...hat.com>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Link: http://lkml.kernel.org/r/20170315135059.GC2177@redhat.com
Link: http://lkml.kernel.org/r/1489598233-25586-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
---
 tools/perf/util/event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index d082cb7..33fc2e9 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -325,8 +325,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
 	if (machine__is_default_guest(machine))
 		return 0;
 
-	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
-		 machine->root_dir, pid);
+	snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
+		 machine->root_dir, pid, pid);
 
 	fp = fopen(filename, "r");
 	if (fp == NULL) {

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ