linux-kernel - Re: [PATCH][GIT PULL] tracing: Fix compile issue for trace_sched

Message-ID: <20101024112540.GA21267@elte.hu>
Date:	Sun, 24 Oct 2010 13:25:40 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Jason Baron <jbaron@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	masami.hiramatsu.pt@...achi.com
Subject: Re: [PATCH][GIT PULL] tracing: Fix compile issue for
 trace_sched_wakeup.c


* Steven Rostedt <rostedt@...dmis.org> wrote:

> On Sat, 2010-10-23 at 22:02 +0200, Ingo Molnar wrote:
> > * Jason Baron <jbaron@...hat.com> wrote:
> > 
> > > > Not the same config, and it's very spurious - i.e. a slightly different -tip 
> > > > version with the same config will boot fine. (this suggests some race)
> > > 
> > > if possible, can you post that .config?
> > 
> > I just reproduced it again with tip-1128a72 - config and full bootlog attached.
> > 
> > The crash picture tends to vary - sometimes it crashes in fork, sometimes in the 
> > timer interrupt. Here's the current one:
> > 
> > [   15.384483] Running tests on trace events:
> > [   15.388580] Testing event kfree_skb: 
> > [   15.392381] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [   15.395408] IP: [<(null)>] (null)
> 
> Interesting, the jump was to NULL. I'm thinking it hit a trace point and
> jumped to a NULL address. I guess there's some strange race here. Is a
> cache flush missing somewhere? I'll look more into this on Monday.

NULL wasn't the only crash i've seen in the past though, here's an older one:

 [    4.983527] Running tests on all trace events:
 [    4.988002] Testing all events:
 [    5.001006] BUG: unable to handle kernel paging request at 7d693ae5
 [    5.001999] IP: [<bf206c23>] 0xbf206c23
 [    5.001999] *pde = 00000000
 [    5.001999] Oops: 0002 [#1] SMP
 [    5.001999] last sysfs file:
 [    5.001999] Modules linked in:
 [    5.001999]
 [    5.001999] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36-rc7-tip+ #48497 /
 [    5.001999] EIP: 0060:[<bf206c23>] EFLAGS: 00010082 CPU: 1
 [    5.001999] EIP is at 0xbf206c23
 [    5.001999] EAX: bf206c25 EBX: 25a98103 ECX: 0001ba00 EDX: 00000000
 [    5.001999] ESI: be48cec0 EDI: 813cba88 EBP: bf206c00 ESP: bec89ee0
 [    5.001999]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068

Another one was:

 [    6.980461] Testing event hrtimer_expire_entry:
 [    7.000007] BUG: unable to handle kernel paging request at a0fe7dfc
 [    7.004000] IP: [<c101b631>] __ticket_spin_lock+0x5/0x15
 [    7.004000] *pde = 00000000
 [    7.004000] Oops: 0002 [#1] SMP
 [    7.004000] last sysfs file:
 [    7.004000] Modules linked in:
 [    7.004000]
 [    7.004000] Pid: 0, comm: kworker/0:0 Not tainted 2.6.36-rc7-tip-01858-g336fdd2-dirty #48488 A8N-E/System Product Name
 [    7.004000] EIP: 0060:[<c101b631>] EFLAGS: 00010002 CPU: 1
 [    7.004000] EIP is at __ticket_spin_lock+0x5/0x15
 [    7.004000] Call Trace:
 [    7.004000]  [<c127e84c>] ? _raw_spin_lock+0x5/0x7
 [    7.004000]  [<c1044699>] ? hrtimer_run_queues+0x1af/0x1fd
 [    7.004000]  [<c1036da9>] ? run_local_timers+0x5/0xf
 [    7.004000]  [<c1036dd4>] ? update_process_times+0x21/0x43
 [    7.004000]  [<c104be84>] ? tick_handle_periodic+0x14/0x68
 [    7.004000]  [<c1015c84>] ? smp_apic_timer_interrupt+0x66/0x75
 [    7.004000]  [<c127f0ff>] ? apic_timer_interrupt+0x2f/0x34
 [    7.004000]  [<c101afc4>] ? native_safe_halt+0x2/0x3
 [    7.004000]  [<c10081c8>] ? default_idle+0x66/0x91
 [    7.004000]  [<c1001901>] ? cpu_idle+0x53/0x9a

so i'd suggest not limiting the investigation to a NULL overwrite alone.

Thanks,

	Ingo
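
Both older traces above print "Oops: 0002". As a reading aid, here is a minimal sketch that decodes that field, assuming it is the standard x86 page-fault error code. The helper name decode_oops_code is hypothetical and is not part of any kernel tooling.

```python
# Decode the hex value printed after "Oops:" in an x86 kernel oops.
# Assumption: the value is the x86 page-fault error code.
# Each entry: (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(text):
    """Return a list of human-readable meanings for a hex error code."""
    code = int(text, 16)
    parts = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


# The code seen in both older traces.
print(decode_oops_code("0002"))
```

Under that assumption, 0002 reads as a kernel-mode write to a page that is not present, which matches the "*pde = 00000000" lines in both traces.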