Message-Id: <1437999699-19632-6-git-send-email-kan.liang@intel.com>
Date:	Mon, 27 Jul 2015 08:21:38 -0400
From:	Kan Liang <kan.liang@...el.com>
To:	acme@...nel.org, jolsa@...nel.org
Cc:	namhyung@...nel.org, ak@...ux.intel.com,
	linux-kernel@...r.kernel.org, Kan Liang <kan.liang@...el.com>
Subject: [PATCH RFC V6 5/6] perf,tool: per-event callgraph support

From: Kan Liang <kan.liang@...el.com>

When multiple events are sampled, it may not be necessary to collect
callgraphs for all of them. The sample sites are usually nearby, so
it is enough to collect callgraphs on a reference event (such as
precise cycles or precise instructions).
This patchkit adds the ability to turn off callgraphs and time stamps
per event. This in turn can reduce sampling overhead and the size of
perf.data. Furthermore, it makes collecting backtraces and timestamps
possible when the PEBS threshold is > 1, which significantly reduces
sampling overhead, especially for frequently occurring events
(https://lkml.org/lkml/2015/5/10/196). For example, a slower event with
a larger period collects backtraces/timestamps, while other, more
frequent events run fast with multi-PEBS. The time stamps from the
slower events can be used to order the faster events, and their
backtraces can give the user enough of a hint to find the right spot.

Here are some examples and test results.

1. Comparing the elapsed time and perf.data size from "kernbench -M -H".

 The test command for FULL callgraph and time support.
   "perf record -e
   '{cpu/cpu-cycles,period=100000/,cpu/instructions,period=20000/p}'
   --call-graph fp --time"

 The test command for PARTIAL callgraph and time support.
   "perf record -e
   '{cpu/cpu-cycles,callgraph=fp,time,period=100000/,
     cpu/instructions,callgraph=no,time=0,period=20000/p}'"

 The elapsed time for FULL is 24.3 sec, while for PARTIAL it is 16.9 sec.
 The perf.data size for FULL is 22.1 GB, while for PARTIAL it is 12.4 GB.

2. Comparing the perf.data size and callgraph results.

 The test command for FULL callgraph and time support.
   "perf record -e
   '{cpu/cpu-cycles,period=100000/pp,cpu/instructions,period=20000/p}'
   --call-graph fp -- ./tchain_edit"

 The test command for PARTIAL callgraph and time support.
   "perf record -e
   '{cpu/cpu-cycles,callgraph=fp,time,period=100000/pp,
     cpu/instructions,callgraph=no,time=0,period=20000/p}'
   -- ./tchain_edit"

 The perf.data size for FULL is 43.2 MB, while for PARTIAL is 21.1 MB.
 The callgraph is roughly the same.

 The callgraph from FULL
 # Samples: 87K of event
 'cpu/cpu-cycles,callgraph=fp,time,period=100000/pp'
 # Event count (approx.): 8760000000
 #
 # Children      Self  Command      Shared Object       Symbol
 # ........  ........  ...........  ..................  ..........................................
 #
    99.98%     0.00%  tchain_edit  libc-2.15.so        [.] __libc_start_main
            |
            ---__libc_start_main

    99.97%     0.00%  tchain_edit  tchain_edit         [.] main
            |
            ---main
               __libc_start_main

    99.97%     0.00%  tchain_edit  tchain_edit         [.] f1
            |
            ---f1
               main
               __libc_start_main

    99.85%    87.01%  tchain_edit  tchain_edit         [.] f3
            |
            ---f3
               |
               |--99.74%-- f2
               |          f1
               |          main
               |          __libc_start_main
                --0.26%-- [...]
    99.71%     0.12%  tchain_edit  tchain_edit         [.] f2
            |
            ---f2
               f1
               main
               __libc_start_main

 The callgraph from PARTIAL
 # Samples: 417K of event
 'cpu/instructions,callgraph=no,time=0,period=20000/p'
 # Event count (approx.): 8346980000
 #
 # Children      Self  Command      Shared Object     Symbol
 # ........  ........  ...........  ................  ..........................................
 #
    98.82%     0.00%  tchain_edit  libc-2.15.so      [.] __libc_start_main
            |
            ---__libc_start_main

    98.82%     0.00%  tchain_edit  tchain_edit       [.] main
            |
            ---main
               __libc_start_main

    98.82%     0.00%  tchain_edit  tchain_edit       [.] f1
            |
            ---f1
               main
               __libc_start_main

    98.82%    98.28%  tchain_edit  tchain_edit       [.] f3
            |
            ---f3
               |
               |--0.53%-- f2
               |          f1
               |          main
               |          __libc_start_main
               |
               |--0.01%-- f1
               |          main
               |          __libc_start_main
                --99.46%-- [...]
    97.63%     0.03%  tchain_edit  tchain_edit       [.] f2
            |
            ---f2
               f1
               main
               __libc_start_main

     7.13%     0.03%  tchain_edit  [kernel.vmlinux]  [k] do_nmi
            |
            ---do_nmi
               end_repeat_nmi
               f3
               f2
               f1
               main
               __libc_start_main

Signed-off-by: Kan Liang <kan.liang@...el.com>
---
 tools/perf/Documentation/perf-record.txt |  4 +++
 tools/perf/util/evsel.c                  | 61 ++++++++++++++++++++++++++++----
 tools/perf/util/evsel.h                  |  4 +++
 tools/perf/util/parse-events.c           | 12 +++++++
 tools/perf/util/parse-events.h           |  2 ++
 tools/perf/util/parse-events.l           |  2 ++
 tools/perf/util/pmu.c                    |  3 +-
 7 files changed, 80 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 0d852d1..becf11d 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -52,6 +52,10 @@ OPTIONS
 	  - 'time': Disable/enable time stamping. Acceptable values are 1 for
 		    enabling time stamping. 0 for disabling time stamping.
 		    The default is 1.
+	  - 'callgraph': Disable/enable callgraph. Acceptable values are "fp"
+			 for FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode
+			 and "no" to disable callgraph.
+	  - 'stack_size': user stack size for dwarf mode
 	  Note: If user explicitly sets options which conflict with the params,
 	  the value set by the params will be overridden.
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 7febfe2..874de7d 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -545,14 +545,15 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
 
 static void
 perf_evsel__config_callgraph(struct perf_evsel *evsel,
-			     struct record_opts *opts)
+			     struct record_opts *opts,
+			     struct callchain_param *param)
 {
 	bool function = perf_evsel__is_function_event(evsel);
 	struct perf_event_attr *attr = &evsel->attr;
 
 	perf_evsel__set_sample_bit(evsel, CALLCHAIN);
 
-	if (callchain_param.record_mode == CALLCHAIN_LBR) {
+	if (param->record_mode == CALLCHAIN_LBR) {
 		if (!opts->branch_stack) {
 			if (attr->exclude_user) {
 				pr_warning("LBR callstack option is only available "
@@ -568,12 +569,12 @@ perf_evsel__config_callgraph(struct perf_evsel *evsel,
 				    "Falling back to framepointers.\n");
 	}
 
-	if (callchain_param.record_mode == CALLCHAIN_DWARF) {
+	if (param->record_mode == CALLCHAIN_DWARF) {
 		if (!function) {
 			perf_evsel__set_sample_bit(evsel, REGS_USER);
 			perf_evsel__set_sample_bit(evsel, STACK_USER);
 			attr->sample_regs_user = PERF_REGS_MASK;
-			attr->sample_stack_user = callchain_param.dump_size;
+			attr->sample_stack_user = param->dump_size;
 			attr->exclude_callchain_user = 1;
 		} else {
 			pr_info("Cannot use DWARF unwind for function trace event,"
@@ -587,11 +588,18 @@ perf_evsel__config_callgraph(struct perf_evsel *evsel,
 	}
 }
 
-static void apply_config_terms(struct perf_evsel *evsel)
+static void apply_config_terms(struct perf_evsel *evsel,
+			       struct record_opts *opts)
 {
 	struct perf_evsel_config_term *term;
 	struct list_head *config_terms = &evsel->config_terms;
 	struct perf_event_attr *attr = &evsel->attr;
+	struct callchain_param param;
+	bool callgraph_set = false;
+
+	/* callgraph default */
+	param.record_mode = callchain_param.record_mode;
+	param.dump_size = 8192;
 
 	list_for_each_entry(term, config_terms, list) {
 		switch (term->type) {
@@ -604,10 +612,49 @@ static void apply_config_terms(struct perf_evsel *evsel)
 			else
 				perf_evsel__reset_sample_bit(evsel, TIME);
 			break;
+		case PERF_EVSEL__CONFIG_TERM_CALLGRAPH:
+			if (!strcmp(term->val.callgraph, "fp")) {
+				param.enabled = true;
+				param.record_mode = CALLCHAIN_FP;
+			} else if (!strcmp(term->val.callgraph, "dwarf")) {
+				param.enabled = true;
+				param.record_mode = CALLCHAIN_DWARF;
+			} else if (!strcmp(term->val.callgraph, "lbr")) {
+				param.enabled = true;
+				param.record_mode = CALLCHAIN_LBR;
+			} else if (!strcmp(term->val.callgraph, "no")) {
+				param.enabled = false;
+			} else {
+				pr_warning("%s is no valid callchain type.\n", term->val.callgraph);
+			}
+			callgraph_set = true;
+			break;
+		case PERF_EVSEL__CONFIG_TERM_STACK_USER:
+			param.dump_size = term->val.stack_user;
+			callgraph_set = true;
+			break;
 		default:
 			break;
 		}
 	}
+
+	/* User explicitly set perf-event callgraph, clear the old setting and reset. */
+	if (callgraph_set) {
+		if (callchain_param.enabled) {
+			perf_evsel__reset_sample_bit(evsel, CALLCHAIN);
+			if (callchain_param.record_mode == CALLCHAIN_LBR) {
+				perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
+				attr->branch_sample_type &= ~(PERF_SAMPLE_BRANCH_USER |
+							      PERF_SAMPLE_BRANCH_CALL_STACK);
+			}
+			if (callchain_param.record_mode == CALLCHAIN_DWARF) {
+				perf_evsel__reset_sample_bit(evsel, REGS_USER);
+				perf_evsel__reset_sample_bit(evsel, STACK_USER);
+			}
+		}
+		if (param.enabled)
+			perf_evsel__config_callgraph(evsel, opts, &param);
+	}
 }
 
 /*
@@ -714,7 +761,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 		evsel->attr.exclude_callchain_user = 1;
 
 	if (callchain_param.enabled && !evsel->no_aux_samples)
-		perf_evsel__config_callgraph(evsel, opts);
+		perf_evsel__config_callgraph(evsel, opts, &callchain_param);
 
 	if (opts->sample_intr_regs) {
 		attr->sample_regs_intr = PERF_REGS_MASK;
@@ -806,7 +853,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts)
 	 * Apply event specific term settings,
 	 * it overloads any global configuration.
 	 */
-	apply_config_terms(evsel);
+	apply_config_terms(evsel, opts);
 }
 
 static int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6a12908..09a3022 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -40,6 +40,8 @@ struct cgroup_sel;
 enum {
 	PERF_EVSEL__CONFIG_TERM_PERIOD,
 	PERF_EVSEL__CONFIG_TERM_TIME,
+	PERF_EVSEL__CONFIG_TERM_CALLGRAPH,
+	PERF_EVSEL__CONFIG_TERM_STACK_USER,
 	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
@@ -49,6 +51,8 @@ struct perf_evsel_config_term {
 	union {
 		u64	period;
 		bool	time;
+		char	*callgraph;
+		u64	stack_user;
 	} val;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index b10a5c0..63e08e7 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -608,6 +608,12 @@ do {									   \
 		if (term->val.num > 1)
 			return -EINVAL;
 		break;
+	case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
+		CHECK_TYPE_VAL(STR);
+		break;
+	case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
+		CHECK_TYPE_VAL(NUM);
+		break;
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 		CHECK_TYPE_VAL(STR);
 		break;
@@ -659,6 +665,12 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_TIME:
 			ADD_CONFIG_TERM(TIME, time, term->val.num);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_CALLGRAPH:
+			ADD_CONFIG_TERM(CALLGRAPH, callgraph, term->val.str);
+			break;
+		case PARSE_EVENTS__TERM_TYPE_STACKSIZE:
+			ADD_CONFIG_TERM(STACK_USER, stack_user, term->val.num);
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index e6f9aacc..87dc9f6 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -64,6 +64,8 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD,
 	PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE,
 	PARSE_EVENTS__TERM_TYPE_TIME,
+	PARSE_EVENTS__TERM_TYPE_CALLGRAPH,
+	PARSE_EVENTS__TERM_TYPE_STACKSIZE,
 };
 
 struct parse_events_term {
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index f542750..16af73b 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -184,6 +184,8 @@ name			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NAME); }
 period			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD); }
 branch_type		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_BRANCH_SAMPLE_TYPE); }
 time			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_TIME); }
+callgraph		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CALLGRAPH); }
+stack_size		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_STACKSIZE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index b615cdf..586b9fd 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -607,7 +607,8 @@ static char *formats_error_string(struct list_head *formats)
 {
 	struct perf_pmu_format *format;
 	char *err, *str;
-	static const char *static_terms = "config,config1,config2,name,period,branch_type,time\n";
+	static const char *static_terms = "config,config1,config2,name,period,"
+					  "branch_type,time,callgraph,stack_size\n";
 	unsigned i = 0;
 
 	if (!asprintf(&str, "valid terms:"))
-- 
1.8.3.1
