Message-ID: <1427884395-241111-4-git-send-email-wangnan0@huawei.com>
Date:	Wed, 1 Apr 2015 10:33:14 +0000
From:	Wang Nan <wangnan0@...wei.com>
To:	<acme@...nel.org>, <jolsa@...nel.org>, <namhyung@...nel.org>
CC:	<mingo@...hat.com>, <lizefan@...wei.com>, <pi3orama@....com>,
	<linux-kernel@...r.kernel.org>
Subject: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.

This patch introduces a --map-adjustment argument for perf report. The
goal of this option is to handle the private dynamic loaders used by
some special programs.

Some programs implement their own dynamic loader instead of using
glibc's ld, for various reasons. They mmap() an executable memory area,
assemble code from different '.so' and '.o' files, and then do the
relocation and code fixups themselves. The memory area is not
file-backed, so perf is unable to use the symbol information in those
files.

This patch allows the user to give perf report hints directly via the
'--map-adjustment' argument. Perf report then regards such mappings as
file-backed and treats them as DSOs instead of private mapping areas.

The main part of this patch resides in util/machine.c. struct map_adj is
introduced to represent each adjustment. Adjustments are kept sorted in
the map_adj_list linked list. When a real MMAP event arrives, perf
checks these adjustments before calling map__new() and
thread__insert_map(), then sets up filename and pgoff according to the
user's hints. It also splits MMAP events when necessary.
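
To make the splitting step concrete, here is a minimal sketch (not the
code in this patch) of how one mmap range could be cut against sorted,
non-overlapping hints. It uses a plain array instead of map_adj_list,
ignores pids, and assumes a 4096-byte page; struct adj, struct piece and
split_mmap() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE 4096ULL	/* assumed page size for this sketch */

/* Hypothetical, simplified stand-in for the patch's struct map_adj. */
struct adj {
	uint64_t start;		/* first address covered by the hint */
	uint64_t len;		/* length in bytes */
	uint64_t pgoff;		/* file offset of 'start', in pages */
	const char *file;	/* object file the range was copied from */
};

/* One output piece; file == NULL means "keep the original mapping". */
struct piece {
	uint64_t start, len, pgoff;
	const char *file;
};

/*
 * Split [start, start + len) against 'adjs', which must be sorted by
 * start address and must not overlap. Returns the number of pieces
 * written to 'out' (never more than 'max').
 */
static size_t split_mmap(uint64_t start, uint64_t len, uint64_t pgoff,
			 const struct adj *adjs, size_t n,
			 struct piece *out, size_t max)
{
	uint64_t end = start + len;
	size_t cnt = 0;

	for (size_t i = 0; i < n && start < end; i++) {
		uint64_t a_start = adjs[i].start;
		uint64_t a_end = a_start + adjs[i].len;
		uint64_t o_end;

		if (a_end <= start)
			continue;	/* hint lies entirely to the left */
		if (a_start >= end)
			break;		/* this and all later hints are to the right */

		if (a_start > start) {
			/* Uncovered gap before the hint: keep original attributes. */
			if (cnt == max)
				return cnt;
			out[cnt++] = (struct piece){ start, a_start - start, pgoff, NULL };
			pgoff += (a_start - start) / PAGE;
			start = a_start;
		}

		/* Overlapping part: take file and pgoff from the hint. */
		o_end = a_end < end ? a_end : end;
		if (cnt == max)
			return cnt;
		out[cnt++] = (struct piece){ start, o_end - start,
			adjs[i].pgoff + (start - a_start) / PAGE, adjs[i].file };
		pgoff += (o_end - start) / PAGE;
		start = o_end;
	}

	/* Whatever is left after the last hint keeps the original attributes. */
	if (start < end && cnt < max)
		out[cnt++] = (struct piece){ start, end - start, pgoff, NULL };
	return cnt;
}
```

For a mapping [0x1000, 0x6000) and hints at [0x2000, 0x4000) and
[0x5000, 0x7000), this yields four pieces: two that keep the original
attributes and two that take filename and pgoff from the hints.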

Usage of --map-adjustment is documented in Documentation/perf-report.txt.

Here is an example:

 $ perf report --map-adjustment=./libtest.so@0x7fa52fcb1000,0x4000,0x21000,92051 \
		--no-children

Where 0x7fa52fcb1000 is the private mapping obtained through:

   mmap(NULL, 4096 * 4, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE,
		   -1, 0);

And its contents are copied from libtest.so.
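
As an illustration of the argument format, the following is a minimal
standalone sketch of parsing objfile@start,length[,pgoff[,pid]]. It is
not the parser in this patch: struct adj_spec, parse_hex() and
parse_adj_spec() are hypothetical names, and the 256-byte filename
buffer is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for one parsed hint. */
struct adj_spec {
	char file[256];
	uint64_t start, len, pgoff;
	long pid;		/* -1 means "all pids" */
};

/* Parse one hexadecimal field at *p and advance *p past it. */
static int parse_hex(const char **p, uint64_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(*p, &end, 16);	/* base 16 also accepts a "0x" prefix */
	if (end == *p || errno)
		return -1;
	*out = v;
	*p = end;
	return 0;
}

/* Returns 0 on success, -1 if 'arg' does not match the format. */
static int parse_adj_spec(const char *arg, struct adj_spec *s)
{
	const char *at = strchr(arg, '@');
	const char *p;
	char *end;
	size_t n;

	if (!at || at == arg)
		return -1;
	n = (size_t)(at - arg);
	if (n >= sizeof(s->file))
		return -1;
	memcpy(s->file, arg, n);
	s->file[n] = '\0';		/* always NUL-terminate the name */

	s->pgoff = 0;			/* defaults for the optional fields */
	s->pid = -1;

	p = at + 1;
	if (parse_hex(&p, &s->start) || *p++ != ',')
		return -1;
	if (parse_hex(&p, &s->len))
		return -1;
	if (*p == ',') {
		p++;
		if (parse_hex(&p, &s->pgoff))
			return -1;
		if (*p == ',') {
			p++;
			errno = 0;
			s->pid = strtol(p, &end, 10);
			if (end == p || errno)
				return -1;
			p = end;
		}
	}
	return *p == '\0' ? 0 : -1;	/* reject trailing garbage */
}
```

Copying the filename with an explicit length check and terminator avoids
relying on the destination buffer's previous contents, and strtoull()
keeps addresses in the upper half of the 64-bit range from overflowing.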

Signed-off-by: Wang Nan <wangnan0@...wei.com>
---
 tools/perf/Documentation/perf-report.txt |  11 ++
 tools/perf/builtin-report.c              |   2 +
 tools/perf/util/machine.c                | 276 ++++++++++++++++++++++++++++++-
 tools/perf/util/machine.h                |   2 +
 4 files changed, 288 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 4879cf6..e19349c 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -323,6 +323,17 @@ OPTIONS
 --header-only::
 	Show only perf.data header (forces --stdio).
 
+--map-adjustment=objfile@start,length[,pgoff[,pid]]::
+	Give memory layout hints for a specific process or for all
+	processes. This makes perf regard the given memory range as
+	mapped from the given file instead of using the original
+	attributes found in perf.data. start and length must be
+	hexadecimal values describing the address range. pgoff must be
+	a hexadecimal value giving the mapping offset (in pages) into
+	that file; the default is 0 (map from the start of the file).
+	If pid is omitted, the adjustment is applied to all processes in
+	the trace, so omit it only when perf.data contains one process.
+
 SEE ALSO
 --------
 linkperf:perf-stat[1], linkperf:perf-annotate[1]
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index b5b2ad4..9fdfb05 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -717,6 +717,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
 		     "Don't show entries under that percent", parse_percent_limit),
 	OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
 		     "how to display percentage of filtered entries", parse_filter_percentage),
+	OPT_CALLBACK(0, "map-adjustment", NULL, "objfile@start,length[,pgoff[,pid]]",
+		     "Provide map adjustment hinting", parse_map_adjustment),
 	OPT_END()
 	};
 	struct perf_data_file file = {
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 051883a..dc9e91e 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1155,21 +1155,291 @@ out_problem:
 	return -1;
 }
 
+/*
+ * Users are allowed to provide map adjustment settings for the case
+ * where an address range is actually privately mapped but known to be
+ * backed by an ELF object file. Like this:
+ *
+ * |<- copied from libx.so ->|  |<- copied from liby.so ->|
+ * |<-------------------- MMAP area --------------------->|
+ *
+ * When dealing with such mmap events, try to obey user adjustment.
+ * Such adjustment settings are not allowed to overlap.
+ * Adjustments won't be considered as valid code until real MMAP events
+ * take place. Therefore, users are allowed to provide adjustments which
+ * cover never mapped areas, like:
+ *
+ * |<- libx.so ->|  |<- liby.so ->|
+ *      |<-- MMAP area -->|
+ *
+ * This feature is useful when dealing with private dynamic linkers,
+ * which assemble code pieces from different ELF objects.
+ *
+ * map_adj_list is an ordered linked list. Order of two adjustments is
+ * first defined by their pid, and then by their start address.
+ * Therefore, adjustments for specific pids are grouped together
+ * naturally.
+ */
+static LIST_HEAD(map_adj_list);
+struct map_adj {
+	u32 pid;
+	u64 start;
+	u64 len;
+	u64 pgoff;
+	struct list_head list;
+	char filename[PATH_MAX];
+};
+
+enum map_adj_cross {
+	MAP_ADJ_LEFT_PID,
+	MAP_ADJ_LEFT,
+	MAP_ADJ_CROSS,
+	MAP_ADJ_RIGHT,
+	MAP_ADJ_RIGHT_PID,
+};
+
+/*
+ * Check whether two map_adj cross over each other. This function is
+ * used for comparing adjustments. For overlapping adjustments, it
+ * reports the difference between the two start addresses (in pages)
+ * and the length of the overlapping area. The sign of pgoff_diff can
+ * be used to determine which one is the left one.
+ *
+ * If either l or r has pid set to -1, pid is not considered.
+ */
+static enum map_adj_cross
+check_map_adj_cross(struct map_adj *l, struct map_adj *r,
+		int *pgoff_diff, u64 *cross_len)
+{
+	bool swapped = false;
+
+	if ((l->pid != (u32)(-1)) && (r->pid != (u32)(-1))
+			&& (l->pid != r->pid))
+		return (l->pid < r->pid) ? MAP_ADJ_LEFT_PID : MAP_ADJ_RIGHT_PID;
+
+	if (l->start > r->start) {
+		struct map_adj *t = l;
+		swapped = true;
+		l = r;
+		r = t;
+	}
+
+	if (l->start + l->len > r->start) {
+		if (pgoff_diff)
+			*pgoff_diff = ((r->start - l->start) / page_size) *
+				(swapped ? -1 : 1);
+		if (cross_len) {
+			u64 cross_start = r->start;
+			u64 l_end = l->start + l->len;
+			u64 r_end = r->start + r->len;
+
+			*cross_len = (l_end < r_end ? l_end : r_end) -
+					cross_start;
+		}
+		return MAP_ADJ_CROSS;
+	}
+
+	return swapped ? MAP_ADJ_RIGHT : MAP_ADJ_LEFT;
+}
+
+static int machine_add_map_adj(u32 pid, u64 start, u64 len,
+		     u64 pgoff, const char *filename)
+{
+	struct map_adj *pos;
+	struct map_adj *new;
+	struct map_adj tmp = {
+		.pid = pid,
+		.start = start,
+		.len = len,
+	};
+
+	if (!filename)
+		return -EINVAL;
+
+	if ((start % page_size) || (len % page_size)) {
+		pr_err("Map adjustment is not page aligned for %d%s.\n", pid,
+				pid == (u32)(-1) ? " (all pids)" : "");
+		return -EINVAL;
+	}
+
+	if ((pid != (u32)(-1)) && (!list_empty(&map_adj_list))) {
+		/*
+		 * Don't allow mixing (u32)(-1) (for all pids) and
+		 * normal pid.
+		 *
+		 * During sorting, (u32)(-1) should be considered as
+		 * the largest pid.
+		 */
+		struct map_adj *largest = list_entry(map_adj_list.prev,
+				struct map_adj, list);
+
+		if (largest->pid == (u32)(-1)) {
+			pr_err("Providing both system-wide and pid specific map adjustments is forbidden.\n");
+			return -EINVAL;
+		}
+	}
+
+	/*
+	 * Find the first one which is larger than tmp and insert new
+	 * adj prior to it.
+	 */
+	list_for_each_entry(pos, &map_adj_list, list) {
+		enum map_adj_cross cross;
+
+		cross = check_map_adj_cross(&tmp, pos, NULL, NULL);
+		if (cross < MAP_ADJ_CROSS)
+			break;
+		if (cross == MAP_ADJ_CROSS) {
+			pr_err("Overlapping map adjustments provided for pid %d%s\n", pid,
+					pid == (u32)(-1) ? " (all pids)" : "");
+			return -EINVAL;
+		}
+	}
+
+	new = malloc(sizeof(*new));
+	if (!new)
+		return -ENOMEM;
+
+	new->pid = pid;
+	new->start = start;
+	new->len = len;
+	new->pgoff = pgoff;
+	snprintf(new->filename, PATH_MAX, "%s", filename);
+	list_add(&new->list, pos->list.prev);
+	return 0;
+}
+
 static int machine_map_new(struct machine *machine, u64 start, u64 len,
 		     u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
 		     u64 ino_gen, u32 prot, u32 flags, char *filename,
 		     enum map_type type, struct thread *thread)
 {
+	struct map_adj *pos;
 	struct map *map;
 
-	map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
-			ino, ino_gen, prot, flags, filename, type, thread);
+	list_for_each_entry(pos, &map_adj_list, list) {
+		u64 adj_start, adj_len, adj_pgoff, cross_len;
+		enum map_adj_cross cross;
+		struct map_adj tmp;
+		int pgoff_diff;
+
+again:
+		if (len == 0)
+			break;
+
+		tmp.pid = pid;
+		tmp.start = start;
+		tmp.len = len;
+
+		cross = check_map_adj_cross(&tmp,
+				pos, &pgoff_diff, &cross_len);
+
+		if (cross < MAP_ADJ_CROSS)
+			break;
+		if (cross > MAP_ADJ_CROSS)
+			continue;
+
+		if (pgoff_diff <= 0) {
+			/*
+			 *       |<----- tmp ----->|
+			 * |<----- pos ----->|
+			 */
+
+			adj_start = tmp.start;
+			adj_len = cross_len;
+			adj_pgoff = pos->pgoff + (-pgoff_diff);
+			map = map__new(machine, adj_start, adj_len, adj_pgoff,
+					pid, 0, 0, 0, 0, prot, flags,
+					pos->filename, type, thread);
+		} else {
+			/*
+			 * |<----- tmp ----->|
+			 * |<-- X -->|<----- pos ----->|
+			 * In this case, only deal with tmp part X. goto again
+			 * instead of next pos.
+			 */
+			adj_start = tmp.start;
+			adj_len = pos->start - tmp.start;
+			adj_pgoff = pgoff;
+			map = map__new(machine, adj_start, adj_len, adj_pgoff,
+					pid, d_maj, d_min, ino, ino_gen, prot,
+					flags, filename, type, thread);
+
+		}
+
+		if (map == NULL)
+			goto error;
+
+		thread__insert_map(thread, map);
+
+		pgoff += adj_len / page_size;
+		start = tmp.start + adj_len;
+		len -= adj_len;
+		if (pgoff_diff > 0)
+			goto again;
+	}
+
+	map = map__new(machine, start, len, pgoff,
+			pid, d_maj, d_min, ino, ino_gen, prot,
+			flags, filename, type, thread);
 
 	if (map == NULL)
-		return -1;
+		goto error;
 
 	thread__insert_map(thread, map);
+
 	return 0;
+error:
+	return -1;
+}
+
+int parse_map_adjustment(const struct option *opt __maybe_unused,
+		const char *arg, int unset __maybe_unused)
+{
+	const char *ptr;
+	char *sep;
+	int err;
+	u64 start, len, pgoff = 0;
+	u32 pid = (u32)(-1);
+	char filename[PATH_MAX];
+
+	sep = strchr(arg, '@');
+	if (sep == NULL)
+		goto err;
+
+	snprintf(filename, sizeof(filename), "%.*s", (int)(sep - arg), arg);
+
+	ptr = sep + 1; /* Skip '@' */
+
+	/* start */
+	start = strtoull(ptr, &sep, 16);
+	if (*sep != ',')
+		goto err;
+	ptr = sep + 1;
+
+	/* len */
+	len = strtoull(ptr, &sep, 16);
+	if (*sep == ',') {
+		/* pgoff */
+		ptr = sep + 1;
+		pgoff = strtoull(ptr, &sep, 16);
+
+		if (*sep == ',') {
+			/* pid */
+			ptr = sep + 1;
+			pid = strtol(ptr, &sep, 10);
+		}
+	}
+
+	if (*sep != '\0')
+		goto err;
+
+	err = machine_add_map_adj(pid, start, len, pgoff, filename);
+	return err;
+
+err:
+	fprintf(stderr, "invalid map adjustment setting: %s\n", arg);
+	return -1;
 }
 
 int machine__process_mmap2_event(struct machine *machine,
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e2faf3b..73b49e4 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -223,4 +223,6 @@ pid_t machine__get_current_tid(struct machine *machine, int cpu);
 int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
 			     pid_t tid);
 
+int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
+
 #endif /* __PERF_MACHINE_H */
-- 
1.8.3.4
