Message-ID: <CABPqkBTdJ8Qy0=3fnhqfj1anU_6R9tGqHGTQCO_SCHphqypJaA@mail.gmail.com>
Date: Fri, 6 Mar 2015 14:51:53 -0500
From: Stephane Eranian <eranian@...gle.com>
To: Vince Weaver <vincent.weaver@...ne.edu>
Cc: Andi Kleen <andi@...stfloor.org>, Jiri Olsa <jolsa@...hat.com>,
"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>,
linux-man@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Mackerras <paulus@...ba.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Chuck Ebbert <cebbert.lkml@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [patch] perf_event_open.2: 3.19 PERF_SAMPLE_REGS_INTR support
On Fri, Mar 6, 2015 at 1:37 PM, Vince Weaver <vincent.weaver@...ne.edu> wrote:
> On Mon, 2 Mar 2015, Andi Kleen wrote:
>
>> > do not enable REGS_USER and REG_INTR at the same time
>> > as REGS_USER will have REG_INTR values and
>> > cannot be used for user stack unwinding
>>
>> If that's true it would be a bug. But I doubt it.
>>
>> The PEBS handler sets up its own pt_regs, so they should
>> be independent.
>
> I could be wrong here, but was tracing through the code.
>
> If you trigger a PEBS interrupt (because you have precise_ip set)
> and you have both REGS_USER and REGS_INTR set, then
> __intel_pmu_pebs_event()
> is called from
> arch/x86/kernel/cpu/perf_event_intel_ds.c
>
> and in there it sets the regs values based solely on
>
> if (sample_type & PERF_SAMPLE_REGS_INTR) {
> }
>
> with those values copies into regs and then passed upstream through
> perf_event_overflow()
>
> so if the sample_type has *both* PERF_SAMPLE_REGS_INTR and
> PERF_SAMPLE_REGS_USER set, then the PERF_SAMPLE_REGS_USER values
> will have the same register values as the PERF_SAMPLE_REGS_INTR values.
>
> Maybe this is the expected behavior, or maybe I am missing something
> still.
>
If you look at perf_sample_regs_user(), it has 3 pt_regs. If the interrupt
occurred while in user mode, then regs_user gets regs. And those could have
been updated by PEBS if REGS_INTR is set. The question is: is this valid?

If PEBS is one entry, then you'd get the state at retirement of the sampled
instruction. The interrupt would come a bit later. The pt_regs reflects user
mode, thus either the sampled instruction was still in user mode or it was in
kernel mode. In the latter case, this is a problem because you are reporting
kernel state for REGS_USER. In the former case, you'd report state for an
instruction that retired earlier than where the interrupt hit.

It boils down to the definition of REGS_USER: is that the last known user
level state, or the interrupted user state?
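
For concreteness, the configuration being discussed (both sample types plus a
nonzero precise_ip) might be requested from user space roughly as follows.
This is only a sketch: the helper name init_both_regs_attr and its reg_mask
argument are made up for illustration, and the event and period are arbitrary.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the kernel or from perf): fill in an attr
 * that asks for both user-level and interrupt-time registers on a precise
 * cycles event. reg_mask is an arch-specific bitmask chosen by the caller.
 */
static void init_both_regs_attr(struct perf_event_attr *attr, uint64_t reg_mask)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;  /* arbitrary event choice */
    attr->sample_period = 100000;             /* arbitrary sampling period */

    /* Request both register sets in every sample. */
    attr->sample_type = PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR;
    attr->sample_regs_user = reg_mask;
    attr->sample_regs_intr = reg_mask;

    /* Nonzero: precise sampling, which is implemented with PEBS on Intel. */
    attr->precise_ip = 1;
}
```

The filled-in attr would then be handed to the perf_event_open() system call;
that step is left out here to keep the sketch short.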
For REGS_INTR:
  - precise_ip = 0: machine state at PMU interrupt
  - precise_ip > 0: machine state at retirement of PEBS sampled instruction

For REGS_USER:
  - precise_ip = 0: last known user level machine state on PMU interrupt
  - precise_ip > 0:
      - interrupt hit in user space: machine state at retirement of
        PEBS sampled instruction
      - interrupt hit in kernel space: last known user level machine
        state on PMU interrupt
At least, that's how I think it currently works.
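
The summary above can be written down as a small decision function. This is
only a model of my reading, not kernel code: the enum and function names are
invented for illustration, and the REGS_USER case assumes REGS_INTR is
requested at the same time.

```c
#include <stdbool.h>

/* Which machine state a sampled register set reflects (illustrative labels). */
enum reg_state {
    STATE_AT_PMU_INTERRUPT,    /* machine state at PMU interrupt */
    STATE_AT_PEBS_RETIREMENT,  /* state at retirement of PEBS sampled instruction */
    STATE_LAST_KNOWN_USER      /* last known user level state on PMU interrupt */
};

/* REGS_INTR: depends only on whether precise sampling is in use. */
static enum reg_state regs_intr_state(unsigned int precise_ip)
{
    return precise_ip > 0 ? STATE_AT_PEBS_RETIREMENT : STATE_AT_PMU_INTERRUPT;
}

/*
 * REGS_USER, assuming REGS_INTR is also set in sample_type:
 * only a precise sample whose interrupt hit in user space reports
 * the PEBS state; every other case reports the last known user state.
 */
static enum reg_state regs_user_state(unsigned int precise_ip, bool irq_in_user)
{
    if (precise_ip > 0 && irq_in_user)
        return STATE_AT_PEBS_RETIREMENT;
    return STATE_LAST_KNOWN_USER;
}
```

For example, regs_user_state(1, false) gives STATE_LAST_KNOWN_USER, while
regs_user_state(1, true) gives STATE_AT_PEBS_RETIREMENT, which is the case
where REGS_USER and REGS_INTR end up carrying the same values.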
Do you agree, Vince?
> Vince
>