Message-ID: <20180108121423.GI3040@hirez.programming.kicks-ass.net>
Date: Mon, 8 Jan 2018 13:14:23 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Jiri Olsa <jolsa@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
lkml <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
David Ahern <dsahern@...il.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>
Subject: Re: [PATCH 03/12] perf: Allocate context task_ctx_data for child event
On Sun, Jan 07, 2018 at 05:03:47PM +0100, Jiri Olsa wrote:
> Currently we use perf_event_context::task_ctx_data to save
> and restore the LBR status when the task is scheduled out
> and in.
>
> We don't allocate it for child contexts, which results in
> a shorter LBR stack for the task, because we don't save the
> history from the previous run and start over every time we
> schedule the task in.
>
> I made a test to generate samples with LBR call stack
> and got higher numbers on bigger chain depths:
>
>                              before:     after:
>   LBR call chain: nr:  1       60561     498127
>   LBR call chain: nr:  2           0          0
>   LBR call chain: nr:  3      107030       2172
>   LBR call chain: nr:  4      466685      62758
>   LBR call chain: nr:  5     2307319     878046
>   LBR call chain: nr:  6       48713     495218
>   LBR call chain: nr:  7        1040       4551
>   LBR call chain: nr:  8         481        172
>   LBR call chain: nr:  9         878        120
>   LBR call chain: nr: 10        2377       6698
>   LBR call chain: nr: 11       28830     151487
>   LBR call chain: nr: 12       29347     339867
>   LBR call chain: nr: 13           4         22
>   LBR call chain: nr: 14           3         53
Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Fixes: 4af57ef28c2c ("perf: Add pmu specific data for perf task context")
> Cc: Andi Kleen <ak@...ux.intel.com>
> Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> ---
> kernel/events/core.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4df5b695bf0d..55fb648a32b0 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10703,6 +10703,19 @@ inherit_event(struct perf_event *parent_event,
>  	if (IS_ERR(child_event))
>  		return child_event;
>
> +
> +	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
> +	    !child_ctx->task_ctx_data) {
> +		struct pmu *pmu = child_event->pmu;
> +
> +		child_ctx->task_ctx_data = kzalloc(pmu->task_ctx_size,
> +						   GFP_KERNEL);
> +		if (!child_ctx->task_ctx_data) {
> +			free_event(child_event);
> +			return NULL;
> +		}
> +	}
> +
>  	/*
>  	 * is_orphaned_event() and list_add_tail(&parent_event->child_list)
>  	 * must be under the same lock in order to serialize against
> @@ -10713,6 +10726,7 @@ inherit_event(struct perf_event *parent_event,
>  	if (is_orphaned_event(parent_event) ||
>  	    !atomic_long_inc_not_zero(&parent_event->refcount)) {
>  		mutex_unlock(&parent_event->child_mutex);
> +		/* task_ctx_data is freed with child_ctx */
>  		free_event(child_event);
>  		return NULL;
>  	}
> --
> 2.13.6
>
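For readers following the quoted hunk: below is a minimal userspace sketch of the allocate-once pattern it adds. This is not part of the patch and not kernel code. The sketch_* types, the ATTACH_TASK_DATA flag and sketch_ensure_task_ctx_data() are hypothetical stand-ins, and calloc() stands in for kzalloc(). It only shows the control flow: allocate the per-task PMU data the first time an inherited event needs it, reuse it afterwards, and report failure so the caller can drop the child event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PERF_ATTACH_TASK_DATA. */
#define ATTACH_TASK_DATA 0x1u

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct sketch_pmu {
	size_t task_ctx_size;		/* size of the per-task PMU data */
};

struct sketch_ctx {
	void *task_ctx_data;		/* NULL until first allocation */
};

struct sketch_event {
	unsigned int attach_state;
	const struct sketch_pmu *pmu;
};

/*
 * Make sure the child context has per-task PMU data if the inherited
 * event asks for it. Returns 0 on success (including "nothing to do")
 * and -1 if the allocation failed.
 */
static int sketch_ensure_task_ctx_data(struct sketch_ctx *child_ctx,
				       const struct sketch_event *child_event)
{
	assert(child_ctx && child_event);

	/* Event does not use per-task data: nothing to allocate. */
	if (!(child_event->attach_state & ATTACH_TASK_DATA))
		return 0;

	/* An earlier inherited event already allocated it: reuse. */
	if (child_ctx->task_ctx_data)
		return 0;

	/* Zeroed allocation, mirroring the kzalloc() in the hunk. */
	child_ctx->task_ctx_data = calloc(1, child_event->pmu->task_ctx_size);
	return child_ctx->task_ctx_data ? 0 : -1;
}
```

In this sketch a caller would treat a -1 return the way the hunk treats a failed kzalloc(): free the child event and return NULL. The buffer itself belongs to the context, not to the event, which matches the comment added in the second hunk.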