Message-ID: <20130806115921.GA14798@mudshark.cambridge.arm.com>
Date: Tue, 6 Aug 2013 12:59:21 +0100
From: Will Deacon <will.deacon@....com>
To: Mark Rutland <mark.rutland@....com>
Cc: Vince Weaver <vincent.weaver@...ne.edu>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...hat.com>,
Paul Mackerras <paulus@...ba.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
"trinity@...r.kernel.org" <trinity@...r.kernel.org>
Subject: Re: perf,arm -- oops in validate_event

On Tue, Aug 06, 2013 at 12:19:32PM +0100, Mark Rutland wrote:
> On Mon, Aug 05, 2013 at 10:17:37PM +0100, Vince Weaver wrote:
> > It looks like in validate_event() we do
> >
> > struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> > ...
> > return armpmu->get_event_idx(hw_events, event) >= 0;
> >
> > armpmu is read into r3, and somehow the value at the offset of
> > armpmu->get_event_idx is either -1 or 0, so when it does a "blx"
> > branch to the address at this offset we get the oops.
> >
> > c001bf8c: e3120010 tst r2, #16
> > c001bf90: 0a000004 beq c001bfa8 <validate_event+0x48>
> > c001bf94: e5933070 ldr r3, [r3, #112] ; 0x70
> > * c001bf98: e12fff33 blx r3
> > c001bf9c: e1e00000 mvn r0, r0
> >
> > I'm having trouble tracing the code back past that, and I don't have time
> > to start adding printk's and recompiling right now.
> >
> > Vince
>
> I think I can save you the effort :)
>
> From the looks of the test case and the kernel code in question, it
> looks like the following happens:
>
> * We create a software event, which becomes its own group leader.
> * We create a hardware event, with the software event as its group
> leader.
> * When we try to schedule the hardware event, we try to validate all
> events in its event group (the leader + siblings), but in doing so we
> treat the software event as a hardware event, and erroneously try to
> get its (non-existent) arm_pmu container, and call some garbage value
> as get_event_idx(...).
>
> This could also happen if we tried to add events from different hardware
> PMUs to the same groups. I'm not sure if that's valid, but I couldn't
> see any code preventing that, and it seems the x86 validation logic is
> wired to allow this. If it's not valid, we could skip validation of
> software events by checking with is_software_event.

But we already check `event->pmu != leader_pmu' in validate_event, so we
shouldn't get anywhere near calling get_event_idx in the case you
describe. It sounds more like we have an inconsistency with one of the
events.
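
For reference, here is a rough userspace sketch of that comparison and of
the kind of per-event dump that would help. It is not the kernel code:
the struct layouts are cut down to the fields used, and `next_sibling',
`skipped_by_pmu_check', `dump_group', `run_example' and the PMU names
are all made up for illustration. It models only the
`event->pmu != leader_pmu' test and none of the other conditions in
validate_event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel types (assumption: only these fields). */
struct pmu {
	const char *name;
};

struct perf_event {
	struct pmu *pmu;
	struct perf_event *group_leader;
	struct perf_event *next_sibling; /* hypothetical; the kernel uses a list_head */
};

/* Models only the pmu comparison: returns 1 if the event would be skipped. */
int skipped_by_pmu_check(const struct perf_event *event)
{
	const struct pmu *leader_pmu = event->group_leader->pmu;

	return event->pmu != leader_pmu;
}

/*
 * Walk the leader and its siblings, print each event, and return how
 * many events the pmu comparison alone does NOT filter out.
 */
int dump_group(const struct perf_event *leader)
{
	const struct perf_event *e;
	int not_skipped = 0;

	assert(leader != NULL && leader->group_leader == leader);

	for (e = leader; e != NULL; e = e->next_sibling) {
		int skip = skipped_by_pmu_check(e);

		printf("event %p pmu=%s leader_pmu=%s -> %s\n",
		       (const void *)e, e->pmu->name, leader->pmu->name,
		       skip ? "skipped" : "not skipped");
		if (!skip)
			not_skipped++;
	}
	return not_skipped;
}

/* The group from the report: software leader with one hardware sibling. */
int run_example(void)
{
	struct pmu sw = { "software" };   /* hypothetical name */
	struct pmu hw = { "hardware" };   /* hypothetical name */
	struct perf_event leader = { &sw, NULL, NULL };
	struct perf_event sibling = { &hw, NULL, NULL };

	leader.group_leader = &leader;
	leader.next_sibling = &sibling;
	sibling.group_leader = &leader;

	return dump_group(&leader);
}
```

In this model the hardware sibling is skipped because its pmu differs
from the leader's, while the leader compares equal to itself, so
run_example() returns 1. Whether the real validate_event behaves the
same way depends on the rest of the function, which is why a dump of
the actual events is still needed.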

Can you dump the events as they're processed in validate_group please?

Will