Message-ID: <CABPqkBSG2iHVe01Qex=8zhcE0j=TwzUotADzN3HL9Jq7omnf0A@mail.gmail.com>
Date: Tue, 13 Oct 2015 17:39:07 -0700
From: Stephane Eranian <eranian@...gle.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"mingo@...e.hu" <mingo@...e.hu>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Subject: Re: [PATCH 2/4] perf/x86: add support for PERF_SAMPLE_BRANCH_CALL
On Tue, Oct 13, 2015 at 6:40 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
>
> * Stephane Eranian <eranian@...gle.com> wrote:
>
> > This patch enables support for PERF_SAMPLE_BRANCH_CALL
> > on Intel x86 processors. When the processor supports LBR filtering,
> > the selection is done in hardware. Otherwise, the filter is
> > applied by software. Note that we chose to include zero-length calls
> > because they also represent calls.
> >
> > Signed-off-by: Stephane Eranian <eranian@...gle.com>
> > ---
> > arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > index ad0b8b0..bfd0b71 100644
> > --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> > @@ -555,6 +555,8 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
> > if (br_type & PERF_SAMPLE_BRANCH_IND_JUMP)
> > mask |= X86_BR_IND_JMP;
> >
> > + if (br_type & PERF_SAMPLE_BRANCH_CALL)
> > + mask |= X86_BR_CALL | X86_BR_ZERO_CALL;
>
> I'm wondering how frequent zero-length calls are. If they still occur in typical
> user-space, would it make sense to also have a separate branch sampling type for
> zero length calls?
>
We could add that. As Andi mentioned, it would rely on the software filter to
catch only the zero-length calls. But I am wondering about the data quality: we
would capture zero-length calls without being able to determine how many we
sampled vs. how many actually occurred, because there is no PMU event that
counts zero-length call branches.

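To make the software-only filtering concrete, here is a minimal user-space
sketch, not the kernel code. It assumes a zero-length call is a direct call
whose target is the instruction immediately following it. The names
`branch_entry`, `classify`, `sw_filter` and the `BR_*` bit values are all
hypothetical; the kernel's X86_BR_* constants and LBR entry layout differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical class bits, for illustration only. */
enum {
    BR_CALL      = 1u << 0, /* ordinary direct call */
    BR_ZERO_CALL = 1u << 1, /* call whose target is the next instruction */
    BR_OTHER     = 1u << 2, /* anything that is not a direct call */
};

/* Hypothetical, simplified view of one recorded branch. */
struct branch_entry {
    uint64_t from;      /* address of the branch instruction */
    uint64_t to;        /* branch target */
    unsigned insn_len;  /* length in bytes of the instruction at 'from' */
    int      is_call;   /* nonzero if the instruction is a direct call */
};

/* Classify one entry; a call that lands on from + insn_len is zero-length. */
static unsigned classify(const struct branch_entry *e)
{
    if (!e->is_call)
        return BR_OTHER;
    return (e->to == e->from + e->insn_len) ? BR_ZERO_CALL : BR_CALL;
}

/*
 * Keep only the entries whose class is selected in 'mask', compacting the
 * array in place. Returns the number of entries kept.
 */
static size_t sw_filter(struct branch_entry *entries, size_t n, unsigned mask)
{
    size_t kept = 0;

    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (classify(&entries[i]) & mask)
            entries[kept++] = entries[i];
    }
    return kept;
}
```

Passing `BR_CALL | BR_ZERO_CALL` as the mask mirrors what the patch does for
PERF_SAMPLE_BRANCH_CALL; passing `BR_ZERO_CALL` alone corresponds to the
separate sampling type being discussed. Either way, the filter only sees the
branches that were sampled, which is the data-quality concern raised above.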
> Intel documents zero length calls as ones that (ab-)use the call instruction to
> push the current IP on the stack:
>
> call next_addr
> next_addr:
> pop %reg
>
> which can take over 10 cycles on certain microarchitectures (and it unbalances
> whatever call stack tracking/caching the CPU does as well).
>
> So it might make sense to analyze them separately. I guess that's the reason why
> Intel added a separate flag for them in the PMU.
>
> Thanks,
>
> Ingo