Message-ID: <20101024112540.GA21267@elte.hu>
Date: Sun, 24 Oct 2010 13:25:40 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Jason Baron <jbaron@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
masami.hiramatsu.pt@...achi.com
Subject: Re: [PATCH][GIT PULL] tracing: Fix compile issue for trace_sched_wakeup.c

* Steven Rostedt <rostedt@...dmis.org> wrote:
> On Sat, 2010-10-23 at 22:02 +0200, Ingo Molnar wrote:
> > * Jason Baron <jbaron@...hat.com> wrote:
> >
> > > > Not the same config, and it's very spurious - i.e. a slightly different -tip
> > > > version with the same config will boot fine. (this suggests some race)
> > >
> > > if possible, can you post that .config?
> >
> > I just reproduced it again with tip-1128a72 - config and full bootlog attached.
> >
> > The crash picture tends to vary - sometimes it crashes in fork, sometimes in the
> > timer interrupt. Here's the current one:
> >
> > [ 15.384483] Running tests on trace events:
> > [ 15.388580] Testing event kfree_skb:
> > [ 15.392381] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [ 15.395408] IP: [<(null)>] (null)
>
> Interesting, the jump was to NULL. I'm thinking it hit a trace point and
> jumped to a NULL address. I guess there's some strange race here. Is a
> cache flush missing somewhere. I'll look more into this on Monday.

NULL wasn't the only crash i've seen in the past though, here's an older one:

[ 4.983527] Running tests on all trace events:
[ 4.988002] Testing all events:
[ 5.001006] BUG: unable to handle kernel paging request at 7d693ae5
[ 5.001999] IP: [<bf206c23>] 0xbf206c23
[ 5.001999] *pde = 00000000
[ 5.001999] Oops: 0002 [#1] SMP
[ 5.001999] last sysfs file:
[ 5.001999] Modules linked in:
[ 5.001999]
[ 5.001999] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36-rc7-tip+ #48497 /
[ 5.001999] EIP: 0060:[<bf206c23>] EFLAGS: 00010082 CPU: 1
[ 5.001999] EIP is at 0xbf206c23
[ 5.001999] EAX: bf206c25 EBX: 25a98103 ECX: 0001ba00 EDX: 00000000
[ 5.001999] ESI: be48cec0 EDI: 813cba88 EBP: bf206c00 ESP: bec89ee0
[ 5.001999] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068

Another one was:

[ 6.980461] Testing event hrtimer_expire_entry:
[ 7.000007] BUG: unable to handle kernel paging request at a0fe7dfc
[ 7.004000] IP: [<c101b631>] __ticket_spin_lock+0x5/0x15
[ 7.004000] *pde = 00000000
[ 7.004000] Oops: 0002 [#1] SMP
[ 7.004000] last sysfs file:
[ 7.004000] Modules linked in:
[ 7.004000]
[ 7.004000] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36-rc7-tip-01858-g336fdd2-dirty #48488 A8N-E/System Product Name
[ 7.004000] EIP: 0060:[<c101b631>] EFLAGS: 00010002 CPU: 1
[ 7.004000] EIP is at __ticket_spin_lock+0x5/0x15
[ 7.004000] Call Trace:
[ 7.004000] [<c127e84c>] ? _raw_spin_lock+0x5/0x7
[ 7.004000] [<c1044699>] ? hrtimer_run_queues+0x1af/0x1fd
[ 7.004000] [<c1036da9>] ? run_local_timers+0x5/0xf
[ 7.004000] [<c1036dd4>] ? update_process_times+0x21/0x43
[ 7.004000] [<c104be84>] ? tick_handle_periodic+0x14/0x68
[ 7.004000] [<c1015c84>] ? smp_apic_timer_interrupt+0x66/0x75
[ 7.004000] [<c127f0ff>] ? apic_timer_interrupt+0x2f/0x34
[ 7.004000] [<c101afc4>] ? native_safe_halt+0x2/0x3
[ 7.004000] [<c10081c8>] ? default_idle+0x66/0x91
[ 7.004000] [<c1001901>] ? cpu_idle+0x53/0x9a

so i'd suggest not limiting the analysis to a NULL overwrite alone.

Thanks,
Ingo
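
For readers decoding the oopses quoted in this message, here is a minimal sketch of how the "Oops: 0002" value maps onto the x86 page-fault error-code bits. The function name decode_pf_error_code and the table layout are hypothetical and chosen only for illustration. The bit meanings (bit 0 protection violation vs. page not present, bit 1 write, bit 2 user mode, bit 3 reserved bit, bit 4 instruction fetch) are the architectural ones.

```python
# Hypothetical helper: decode the hex value printed after "Oops:" in an
# x86 page-fault report. Each entry is (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code_hex):
    """Return a list of human-readable flags for an oops error code."""
    code = int(code_hex, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


print(decode_pf_error_code("0002"))
# prints ['page not present', 'write access', 'kernel mode']
```

Both full oopses above report 0002, which decodes to a kernel-mode write to an unmapped address (7d693ae5 and a0fe7dfc respectively). That reading appears consistent with the point that the failures are not limited to a jump to NULL.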