Message-Id: <52FBBE19-660F-4A91-B9DF-2949655AD2C7@163.com>
Date: Wed, 1 Apr 2015 22:23:36 +0800
From: pi3orama <pi3orama@....com>
To: Wang Nan <wangnan0@...wei.com>
Cc: "<acme@...nel.org>" <acme@...nel.org>,
"<jolsa@...nel.org>" <jolsa@...nel.org>,
"<namhyung@...nel.org>" <namhyung@...nel.org>,
"<mingo@...hat.com>" <mingo@...hat.com>,
"<lizefan@...wei.com>" <lizefan@...wei.com>,
"<linux-kernel@...r.kernel.org>" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/4] perf tools: report: introduce --map-adjustment argument.
Sent from my iPhone
> On Apr 1, 2015, at 6:33 PM, Wang Nan <wangnan0@...wei.com> wrote:
>
> This patch introduces a --map-adjustment argument for perf report. The
> goal of this option is to deal with the private dynamic loaders used in
> some special programs.
>
> Some programs use their own private dynamic loader instead of glibc ld for
> different reasons. They mmap() an executable memory area, assemble code
> from different '.so' and '.o' files, then do the relocation and code
> fixing themselves. The memory area is not file-backed, so perf is
> unable to handle symbol information in those files.
>
> This patch allows the user to give perf report hints directly using the
> '--map-adjustment' argument. Perf report will regard such mappings as
> file-backed mappings and treat them as dsos instead of private mapping
> areas.
>
> The main part of this patch resides in util/machine.c. struct map_adj is
> introduced to represent each adjustment. They are sorted and linked
> together into the map_adj_list linked list. When a real MMAP event arrives,
> perf checks such adjustments before calling map__new() and
> thread__insert_map(), then sets up filename and pgoff according to user
> hints. It also splits MMAP events when necessary.
>
> Usage of --map-adjustment is added to Documentation/perf-report.txt.
>
> Here is an example:
>
> $ perf report --map-adjustment=./libtest.so@...fa52fcb1000,0x4000,0x21000,92051 \
> --no-children
>
> Where 0x7fa52fcb1000 is the private mapping obtained through:
>
> mmap(NULL, 4096 * 4, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE,
> -1, 0);
>
> And its contents are copied from libtest.so.
>
> Signed-off-by: Wang Nan <wangnan0@...wei.com>
> ---
> tools/perf/Documentation/perf-report.txt | 11 ++
> tools/perf/builtin-report.c | 2 +
> tools/perf/util/machine.c | 276 ++++++++++++++++++++++++++++++-
> tools/perf/util/machine.h | 2 +
> 4 files changed, 288 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 4879cf6..e19349c 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -323,6 +323,17 @@ OPTIONS
> --header-only::
> Show only perf.data header (forces --stdio).
>
> +--map-adjustment=objfile@...rt,length[,pgoff[,pid]]::
> + Give memory layout hints for a specific process or for all
> + processes. This makes perf regard the provided range of memory as
> + mapped from the provided file instead of using the original
> + attributes found in perf.data. start and length should be
> + hexadecimal values representing the address range. pgoff should be
> + a hexadecimal value representing the mapping offset (in pages) of
> + that file. Default pgoff value is 0 (map from start of the file).
> + If pid is omitted, the adjustment is applied to all processes in
> + this trace. This should be used when perf.data contains only 1 process.
> +
> SEE ALSO
> --------
> linkperf:perf-stat[1], linkperf:perf-annotate[1]
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index b5b2ad4..9fdfb05 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -717,6 +717,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> "Don't show entries under that percent", parse_percent_limit),
> OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
> "how to display percentage of filtered entries", parse_filter_percentage),
> + OPT_CALLBACK(0, "map-adjustment", NULL, "objfile@...rt,length[,pgoff[,pid]]",
> + "Provide map adjustment hinting", parse_map_adjustment),
> OPT_END()
> };
> struct perf_data_file file = {
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 051883a..dc9e91e 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -1155,21 +1155,291 @@ out_problem:
> return -1;
> }
>
> +/*
> + * Users are allowed to provide map adjustment settings for the case
> + * where an address range is actually privately mapped but known to be
> + * backed by an ELF object file. Like this:
> + *
> + * |<- copied from libx.so ->| |<- copied from liby.so ->|
> + * |<-------------------- MMAP area --------------------->|
> + *
> + * When dealing with such mmap events, try to obey user adjustment.
> + * Such adjustment settings are not allowed to overlap.
> + * Adjustments won't be considered as valid code until real MMAP events
> + * take place. Therefore, users are allowed to provide adjustments which
> + * cover never mapped areas, like:
> + *
> + * |<- libx.so ->| |<- liby.so ->|
> + * |<-- MMAP area -->|
> + *
> + * This feature is useful when dealing with private dynamic linkers,
> + * which assemble code pieces from different ELF objects.
> + *
> + * map_adj_list is an ordered linked list. Order of two adjustments is
> + * first defined by their pid, and then by their start address.
> + * Therefore, adjustments for specific pids are grouped together
> + * naturally.
> + */
> +static LIST_HEAD(map_adj_list);
> +struct map_adj {
> + u32 pid;
> + u64 start;
> + u64 len;
> + u64 pgoff;
> + struct list_head list;
> + char filename[PATH_MAX];
> +};
> +
> +enum map_adj_cross {
> + MAP_ADJ_LEFT_PID,
> + MAP_ADJ_LEFT,
> + MAP_ADJ_CROSS,
> + MAP_ADJ_RIGHT,
> + MAP_ADJ_RIGHT_PID,
> +};
> +
> +/*
> + * Check whether two map_adj cross over each other. This function is
> + * used for comparing adjustments. For overlapping adjustments, it
> + * reports the difference between the two start addresses and the length
> + * of the overlapping area. The sign of pgoff_diff can be used to
> + * determine which one is the left one.
> + *
> + * If either r or l has pid set to -1, don't consider pid.
> + */
> +static enum map_adj_cross
> +check_map_adj_cross(struct map_adj* l, struct map_adj* r,
> + int *pgoff_diff, u64 *cross_len)
> +{
> + bool swapped = false;
> +
> + if ((l->pid != (u32)(-1)) && (r->pid != (u32)(-1))
> + && (l->pid != r->pid))
> + return (l->pid < r->pid) ? MAP_ADJ_LEFT_PID : MAP_ADJ_RIGHT_PID;
> +
> + if (l->start > r->start) {
> + struct map_adj *t = l;
> + swapped = true;
> + l = r;
> + r = t;
> + }
> +
> + if (l->start + l->len > r->start) {
> + if (pgoff_diff)
> + *pgoff_diff = ((r->start - l->start) / page_size) *
> + (swapped ? -1 : 1);
> + if (cross_len) {
> + u64 cross_start = r->start;
> + u64 l_end = l->start + l->len;
> + u64 r_end = r->start + r->len;
> +
> + *cross_len = (l_end < r_end ? l_end : r_end) -
> + cross_start;
> + }
> + return MAP_ADJ_CROSS;
> + }
> +
> + return swapped ? MAP_ADJ_RIGHT : MAP_ADJ_LEFT;
> +}
> +
> +static int machine_add_map_adj(u32 pid, u64 start, u64 len,
> + u64 pgoff, const char *filename)
> +{
> + struct map_adj *pos;
> + struct map_adj *new;
> + struct map_adj tmp = {
> + .pid = pid,
> + .start = start,
> + .len = len,
> + };
> +
> + if (!filename)
> + return -EINVAL;
> +
> + if ((start % page_size) || (len % page_size)) {
> + pr_err("Map adjustment is not page aligned for %d%s.\n", pid,
> + pid == (u32)(-1) ? " (all pids)" : "");
> + return -EINVAL;
> + }
> +
> + if ((pid != (u32)(-1)) && (!list_empty(&map_adj_list))) {
> + /*
> + * Don't allow mixing (u32)(-1) (for all pids) and
> + * normal pid.
> + *
> + * During sorting, (u32)(-1) should be considered as
> + * the largest pid.
> + */
> + struct map_adj *largest = list_entry(map_adj_list.prev,
> + struct map_adj, list);
> +
> + if (largest->pid == (u32)(-1)) {
> + pr_err("Providing both system-wide and pid specific map adjustments is forbidden.\n");
> + return -EINVAL;
> + }
> + }
> +
> + /*
> + * Find the first one which is larger than tmp and insert new
> + * adj prior to it.
> + */
> + list_for_each_entry(pos, &map_adj_list, list) {
> + enum map_adj_cross cross;
> +
> + cross = check_map_adj_cross(&tmp, pos, NULL, NULL);
> + if (cross < MAP_ADJ_CROSS)
> + break;
> + if (cross == MAP_ADJ_CROSS) {
> + pr_err("Overlapping map adjustments provided for pid %d%s\n", pid,
> + pid == (u32)(-1) ? " (all pids)" : "");
> + return -EINVAL;
> + }
> + }
> +
> + new = malloc(sizeof(*new));
> + if (!new)
> + return -EINVAL;
> +
> + new->pid = pid;
> + new->start = start;
> + new->len = len;
> + new->pgoff = pgoff;
> + strncpy(new->filename, filename, PATH_MAX);
> + list_add(&new->list, pos->list.prev);
> + return 0;
> +}
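The ordering rule described in the comment above (first by pid, then by start address, with (u32)(-1) treated as the largest pid) can be illustrated with a small standalone comparator. This is only a sketch under simplifying assumptions: struct adj_key, adj_key_cmp() and sort_adj_keys() are hypothetical names, the sketch sorts an array with qsort() rather than walking a list_head, and it does not reproduce the overlap handling of check_map_adj_cross().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced version of struct map_adj: only the sort keys. */
struct adj_key {
	uint32_t pid;	/* UINT32_MAX stands for "all pids" and sorts last */
	uint64_t start;
};

/* Order by pid first, then by start address. */
static int adj_key_cmp(const void *a, const void *b)
{
	const struct adj_key *l = a;
	const struct adj_key *r = b;

	if (l->pid != r->pid)
		return l->pid < r->pid ? -1 : 1;
	if (l->start != r->start)
		return l->start < r->start ? -1 : 1;
	return 0;
}

/* Sort an array of keys so entries for the same pid end up adjacent. */
static void sort_adj_keys(struct adj_key *keys, size_t n)
{
	assert(keys != NULL || n == 0);
	qsort(keys, n, sizeof(*keys), adj_key_cmp);
}
```

Because pid is compared as an unsigned 32-bit value, an "all pids" entry naturally lands at the tail, which is the property the patch relies on when it inspects map_adj_list.prev.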
> +
> static int machine_map_new(struct machine *machine, u64 start, u64 len,
> u64 pgoff, u32 pid, u32 d_maj, u32 d_min, u64 ino,
> u64 ino_gen, u32 prot, u32 flags, char *filename,
> enum map_type type, struct thread *thread)
> {
> + struct map_adj *pos;
> struct map *map;
>
> - map = map__new(machine, start, len, pgoff, pid, d_maj, d_min,
> - ino, ino_gen, prot, flags, filename, type, thread);
> + list_for_each_entry(pos, &map_adj_list, list) {
> + u64 adj_start, adj_len, adj_pgoff, cross_len;
> + enum map_adj_cross cross;
> + struct map_adj tmp;
> + int pgoff_diff;
> +
> +again:
> + if (len == 0)
> + break;
> +
> + tmp.pid = pid;
> + tmp.start = start;
> + tmp.len = len;
> +
> + cross = check_map_adj_cross(&tmp,
> + pos, &pgoff_diff, &cross_len);
> +
> + if (cross < MAP_ADJ_CROSS)
> + break;
> + if (cross > MAP_ADJ_CROSS)
> + continue;
> +
> + if (pgoff_diff <= 0) {
> + /*
> + * |<----- tmp ----->|
> + * |<----- pos ----->|
> + */
> +
> + adj_start = tmp.start;
> + adj_len = cross_len;
> + adj_pgoff = pos->pgoff + (-pgoff_diff);
> + map = map__new(machine, adj_start, adj_len, adj_pgoff,
> + pid, 0, 0, 0, 0, prot, flags,
> + pos->filename, type, thread);
> + } else {
> + /*
> + * |<----- tmp ----->|
> + * |<-- X -->|<----- pos ----->|
> + * In this case, only deal with tmp part X. goto again
> + * instead of next pos.
> + */
> + adj_start = tmp.start;
> + adj_len = tmp.len - cross_len;
> + adj_pgoff = tmp.pgoff;
> + map = map__new(machine, adj_start, adj_len, adj_pgoff,
> + pid, d_maj, d_min, ino, ino_gen, prot,
> + flags, filename, type, thread);
> +
> + }
> +
> + if (map == NULL)
> + goto error;
> +
> + thread__insert_map(thread, map);
> +
> + pgoff += adj_len / page_size;
> + start = tmp.start + adj_len;
> + len -= adj_len;
> + if (pgoff_diff > 0)
> + goto again;
> + }
> +
> + map = map__new(machine, start, len, pgoff,
> + pid, d_maj, d_min, ino, ino_gen, prot,
> + flags, filename, type, thread);
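We'd better check the value of len here and only create this final mapping
if len is not 0. Otherwise, when the adjustments have already consumed the
whole range, map__new() is called for a zero-length map.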
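A minimal standalone sketch of that guard, assuming a simplified case with a single adjustment. struct piece and split_mapping() are hypothetical names used only for illustration; they are not perf code and do not handle pgoff or pids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one map that would be passed to map__new(). */
struct piece {
	uint64_t start;
	uint64_t len;
	int adjusted;	/* 1 if the piece is covered by the user adjustment */
};

/*
 * Split the mapping [start, start + len) around one adjustment
 * [adj_start, adj_start + adj_len). Writes at most 3 pieces to out and
 * returns how many were written. Zero-length pieces are never emitted.
 */
static size_t split_mapping(uint64_t start, uint64_t len,
			    uint64_t adj_start, uint64_t adj_len,
			    struct piece out[3])
{
	uint64_t end = start + len;
	uint64_t adj_end = adj_start + adj_len;
	uint64_t lo = adj_start > start ? adj_start : start;
	uint64_t hi = adj_end < end ? adj_end : end;
	size_t n = 0;

	assert(end >= start && adj_end >= adj_start);	/* no wrap-around */

	if (lo >= hi) {
		/* No overlap: keep the mapping as it is, unless it is empty. */
		if (len != 0)
			out[n++] = (struct piece){ start, len, 0 };
		return n;
	}

	if (lo > start)		/* head in front of the adjustment */
		out[n++] = (struct piece){ start, lo - start, 0 };

	out[n++] = (struct piece){ lo, hi - lo, 1 };

	if (end > hi)		/* tail: only when something is left over */
		out[n++] = (struct piece){ hi, end - hi, 0 };

	return n;
}
```

When the adjustment covers the mapping exactly, only the adjusted piece is produced and no empty trailing piece follows it, which is the behaviour the len check is meant to guarantee.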
> if (map == NULL)
> - return -1;
> + goto error;
>
> thread__insert_map(thread, map);
> +
> return 0;
> +error:
> + return -1;
> +}
> +
> +int parse_map_adjustment(const struct option *opt __maybe_unused,
> + const char *arg, int unset __maybe_unused)
> +{
> + const char *ptr;
> + char *sep;
> + int err;
> + u64 start, len, pgoff = 0;
> + u32 pid = (u32)(-1);
> + char filename[PATH_MAX];
> +
> + sep = strchr(arg, '@');
> + if (sep == NULL)
> + goto err;
> +
> + strncpy(filename, arg, sep - arg);
> +
> + ptr = sep + 1; /* Skip '@' */
> +
> + /* start */
> + start = strtoll(ptr, &sep, 16);
> + if (*sep != ',')
> + goto err;
> + ptr = sep + 1;
> +
> + /* len */
> + len = strtoll(ptr, &sep, 16);
> + if (*sep == ',') {
> + /* pgoff */
> + ptr = sep + 1;
> + pgoff = strtoll(ptr, &sep, 16);
> +
> + if (*sep == ',') {
> + /* pid */
> + ptr = sep + 1;
> + pid = strtol(ptr, &sep, 10);
> + }
> + }
> +
> + if (*sep != '\0')
> + goto err;
> +
> + err = machine_add_map_adj(pid, start, len, pgoff, filename);
> + return err;
> +
> +err:
> + fprintf(stderr, "invalid map adjustment setting: %s\n", arg);
> + return -1;
> }
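For reference, the option syntax described in the documentation hunk (file name, '@', hexadecimal start and length, then optional hexadecimal pgoff and decimal pid) can be parsed along the following lines. This is a hedged standalone sketch, not a proposed replacement: struct adj_spec, parse_hex(), parse_adj_spec() and ADJ_PATH_MAX are hypothetical names, and the sketch uses memcpy() with an explicit terminator because strncpy() with a length of sep - arg does not NUL-terminate the destination.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ADJ_PATH_MAX 256	/* hypothetical limit for this sketch */

struct adj_spec {
	char filename[ADJ_PATH_MAX];
	uint64_t start, len, pgoff;
	uint32_t pid;		/* UINT32_MAX means "all pids" */
};

/* Parse one hexadecimal field; *end is left at the first unparsed char. */
static int parse_hex(const char *s, char **end, uint64_t *val)
{
	errno = 0;
	*val = strtoull(s, end, 16);
	return (errno || *end == s) ? -1 : 0;
}

/* Returns 0 on success, -1 if arg is malformed. */
static int parse_adj_spec(const char *arg, struct adj_spec *spec)
{
	const char *at;
	char *end;
	size_t n;

	assert(arg != NULL && spec != NULL);

	at = strchr(arg, '@');
	if (!at || at == arg)
		return -1;
	n = (size_t)(at - arg);
	if (n >= sizeof(spec->filename))
		return -1;
	memcpy(spec->filename, arg, n);
	spec->filename[n] = '\0';	/* explicit terminator */

	spec->pgoff = 0;		/* default: map from start of file */
	spec->pid = UINT32_MAX;		/* default: all pids */

	if (parse_hex(at + 1, &end, &spec->start) || *end != ',')
		return -1;
	if (parse_hex(end + 1, &end, &spec->len))
		return -1;
	if (*end == ',') {
		if (parse_hex(end + 1, &end, &spec->pgoff))
			return -1;
		if (*end == ',') {
			const char *p = end + 1;
			unsigned long pid;

			errno = 0;
			pid = strtoul(p, &end, 10);
			if (errno || end == p || pid >= UINT32_MAX)
				return -1;
			spec->pid = (uint32_t)pid;
		}
	}
	return *end == '\0' ? 0 : -1;
}
```

Checking that each strtoull()/strtoul() call consumed at least one character rejects empty fields such as "lib.so@,4000", which a bare separator check would accept as zero.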
>
> int machine__process_mmap2_event(struct machine *machine,
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index e2faf3b..73b49e4 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -223,4 +223,6 @@ pid_t machine__get_current_tid(struct machine *machine, int cpu);
> int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid,
> pid_t tid);
>
> +int parse_map_adjustment(const struct option *opt, const char *arg, int unset);
> +
> #endif /* __PERF_MACHINE_H */
> --
> 1.8.3.4
>