[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150615104813.GU3644@twins.programming.kicks-ass.net>
Date: Mon, 15 Jun 2015 12:48:13 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Andi Kleen <andi@...stfloor.org>
Cc: acme@...nel.org, linux-kernel@...r.kernel.org, jolsa@...hat.com,
eranian@...gle.com, Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 2/5] x86, perf: Add option to disable reading branch
flags/cycles
On Wed, May 27, 2015 at 09:13:15PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@...ux.intel.com>
>
> With LBRv5 reading the extra LBR flags like mispredict, TSX, cycles
> is not free anymore, as it has moved to a separate MSR.
>
> For callstack mode we don't need any of this information; so we can
> avoid the unnecessary MSR read. Add flags to the perf interface
> where perf record can request not collecting this information.
> I added sample_type flags for CYCLES and FLAGS. It's a bit unusual for
> sample_types to be negative (disable), not positive (enable), but
> since the legacy ABI reported the flags we need some form of explicit
> disabling to avoid breaking the ABI. In theory it would be possible
> to make CYCLES opt-in (as it's not deployed yet), but I also made it
> opt-out to be symmetric to FLAGS.
>
> After we have the flags the x86 perf code can keep track if any
> users need the flags. If noone needs it the information is not
> collected.
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> index 1fd8b5a..5de2048 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -340,6 +340,10 @@ void intel_pmu_lbr_enable(struct perf_event *event)
> }
>
> cpuc->lbr_users++;
> + if (!(event->attr.sample_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
> + cpuc->lbr_flags_users++;
> + if (!(event->attr.sample_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
> + cpuc->lbr_cycles_users++;
> perf_sched_cb_inc(event->ctx->pmu);
> }
This patch seems to add an unfortunate amount of branches. And while I
appreciate it'll all be cheaper than the SKL MSR read, it does add
overhead to the older chips.
Also, I think its broken. We can now have two events, with the 'same'
LBR config but with different flags/cycles settings.
__intel_shared_reg_get_constraints() will find a match, but the enable
code above will disable flags/cycles for both of them.
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 1b8bd4a..8dd5765 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -138,6 +138,8 @@ enum perf_event_sample_format {
> PERF_SAMPLE_IDENTIFIER = 1U << 16,
> PERF_SAMPLE_TRANSACTION = 1U << 17,
> PERF_SAMPLE_REGS_INTR = 1U << 18,
> + PERF_SAMPLE_BRANCH_NO_FLAGS = 1U << 19,
> + PERF_SAMPLE_BRANCH_NO_CYCLES = 1U << 20,
>
> PERF_SAMPLE_MAX = 1U << 19, /* non-ABI */
> };
Should this not be part of perf_branch_sample_type instead? That seems
to otherwise specify all the branch stack details.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists