linux-kernel - Re: perf,arm -- oops in validate

Message-ID: <20130806115921.GA14798@mudshark.cambridge.arm.com>
Date:	Tue, 6 Aug 2013 12:59:21 +0100
From:	Will Deacon <will.deacon@....com>
To:	Mark Rutland <mark.rutland@....com>
Cc:	Vince Weaver <vincent.weaver@...ne.edu>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	"trinity@...r.kernel.org" <trinity@...r.kernel.org>
Subject: Re: perf,arm -- oops in validate_event

On Tue, Aug 06, 2013 at 12:19:32PM +0100, Mark Rutland wrote:
> On Mon, Aug 05, 2013 at 10:17:37PM +0100, Vince Weaver wrote:
> > It looks like in validate_event() we do
> > 
> >         struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> >         ...
> >         return armpmu->get_event_idx(hw_events, event) >= 0;
> > 
> > armpmu is read into r3, and somehow the value at the offset of
> > armpmu->get_event_idx is either -1 or 0, so when it does a "blx"
> > branch to the address at this offset we get the oops.
> > 
> >   c001bf8c:       e3120010        tst     r2, #16
> >   c001bf90:       0a000004        beq     c001bfa8 <validate_event+0x48>
> >   c001bf94:       e5933070        ldr     r3, [r3, #112]  ; 0x70
> > * c001bf98:       e12fff33        blx     r3
> >   c001bf9c:       e1e00000        mvn     r0, r0
> > 
> > I'm having trouble tracing the code back past that, and I don't have time
> > to start adding printk's and recompiling right now.
> > 
> > Vince
> 
> I think I can save you the effort :)
> 
> From the looks of the test case and the kernel code in question, it
> looks like the following happens:
> 
> * We create a software event, which becomes its own group leader.
> * We create a hardware event, with the software event as its group
>   leader.
> * When we try to schedule the hardware event, we try to validate all
>   events in its event group (the leader + siblings), but in doing so we
>   treat the software event as a hardware event, and erroneously try to
>   get its (non-existent) arm_pmu container, and call some garbage value
>   as get_event_idx(...).
> 
> This could also happen if we tried to add events from different hardware
> PMUs to the same groups. I'm not sure if that's valid, but I couldn't
> see any code preventing that, and it seems the x86 validation logic is
> wired to allow this. If it's not valid, we could skip validation of
> software events by checking with is_software_event.

But we already check `event->pmu != leader_pmu' in validate_event, so we
shouldn't get anywhere near calling get_event_idx in the case you
describe. It sounds more like we have an inconsistency with one of the
events.
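
For illustration only, here is a minimal userspace sketch of the guard
described above. Every type and function name ending in _stub or _sketch
is a hypothetical stand-in, not the kernel's definition; only the shape
of the `event->pmu != leader_pmu' test and the container_of-style
downcast follow the code quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the thread. */
struct pmu { int type; };

struct hw_events_stub { unsigned long used_mask; };

struct perf_event_stub { struct pmu *pmu; };

struct arm_pmu_stub {
	struct pmu pmu;		/* generic PMU embedded in the ARM one */
	int (*get_event_idx)(struct hw_events_stub *hw,
			     struct perf_event_stub *event);
};

/* Recover the enclosing structure from a pointer to an embedded member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define to_arm_pmu_stub(p) container_of(p, struct arm_pmu_stub, pmu)

/* Toy allocator: hand out the first free counter out of four. */
static int stub_get_event_idx(struct hw_events_stub *hw,
			      struct perf_event_stub *event)
{
	int i;

	(void)event;
	for (i = 0; i < 4; i++) {
		if (!(hw->used_mask & (1UL << i))) {
			hw->used_mask |= 1UL << i;
			return i;
		}
	}
	return -1;
}

/*
 * Returns 1 if the event is acceptable. An event whose pmu differs from
 * leader_pmu is skipped before any downcast, so get_event_idx is never
 * reached through a struct pmu that is not embedded in an arm_pmu_stub.
 */
static int validate_event_sketch(struct hw_events_stub *hw,
				 struct perf_event_stub *event,
				 struct pmu *leader_pmu)
{
	struct arm_pmu_stub *armpmu;

	assert(event && event->pmu);

	if (event->pmu != leader_pmu)
		return 1;

	armpmu = to_arm_pmu_stub(event->pmu);
	return armpmu->get_event_idx(hw, event) >= 0;
}
```

In this sketch the downcast is only safe because leader_pmu is assumed
to point into an arm_pmu_stub; the guard by itself says nothing about
what kind of PMU the leader belongs to.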

Can you dump the events as they're processed in validate_group please?

Will