Date:	Thu, 3 Sep 2009 20:21:40 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Martin Schwidefsky <schwidefsky@...ibm.com>, mingo@...hat.com,
	hpa@...or.com, linux-kernel@...r.kernel.org, johnstul@...ibm.com,
	linux-tip-commits@...r.kernel.org
Subject: Re: [boot crash] Re: [tip:timers/core] clocksource: Resolve cpu
	hotplug dead lock with TSC unstable


* Ingo Molnar <mingo@...e.hu> wrote:

> -tip testing found the following boot crash on a 64-bit x86 system:
> 
> [    0.405247] initcall spawn_softlockup_task+0x0/0xa5 returned 0 after 0 usecs
> [    0.410004] calling  relay_init+0x0/0x40 @ 1
> [    0.420005] initcall relay_init+0x0/0x40 returned 0 after 0 usecs
> [    0.426355] lockdep: fixing up alternatives.
> [    0.430110] Booting processor 1 APIC 0x1 ip 0x6000
> [    0.030000] Initializing CPU#1
> [    0.030000] masked ExtINT on CPU#1
> [    0.520060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> [    0.530000] IP: [<ffffffff81071317>] queue_work_on+0x27/0x70
> [    0.530000] PGD 0 
> [    0.530000] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
> [    0.530000] last sysfs file: 
> [    0.530000] CPU 0 
> [    0.530000] Modules linked in:
> [    0.530000] Pid: 1, comm: swapper Not tainted 2.6.31-rc8-tip #1613         
> [    0.530000] RIP: 0010:[<ffffffff81071317>]  [<ffffffff81071317>] queue_work_on+0x27/0x70
> [    0.530000] RSP: 0018:ffff880009007d40  EFLAGS: 00010246
> [    0.530000] RAX: 0000000000000000 RBX: ffffffff81e1d0c0 RCX: 0000000000000000
> [    0.530000] RDX: ffffffff824a1fa0 RSI: 0000000000000000 RDI: 0000000000000000
> [    0.530000] RBP: ffff880009007d50 R08: 0000000000000000 R09: 0000000000000000
> [    0.530000] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000025cb39a8
> [    0.530000] R13: ffffffff824a0c40 R14: ffff880009007e50 R15: 0000000000000100
> [    0.530000] FS:  0000000000000000(0000) GS:ffff880009004000(0000) knlGS:0000000000000000
> [    0.530000] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [    0.530000] CR2: 0000000000000020 CR3: 0000000001001000 CR4: 00000000000006b0
> [    0.530000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    0.530000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    0.530000] Process swapper (pid: 1, threadinfo ffff88003f0da000, task ffff88003f0e0000)
> [    0.530000] Stack:
> [    0.530000]  ffffffff824a0c40 00000000b4b7426a ffff880009007d70 ffffffff8107157d
> [    0.530000] <0> ffffffff81083846 00000000b4b7426a ffff880009007d90 ffffffff810715c9
> [    0.530000] <0> 0000000000000000 00000000b4b7426a ffff880009007dc0 ffffffff81083927
> [    0.530000] Call Trace:
> [    0.530000]  <IRQ> 
> [    0.530000]  [<ffffffff8107157d>] queue_work+0x2d/0x50
> [    0.530000]  [<ffffffff81083846>] ? clocksource_watchdog+0x26/0x240
> [    0.530000]  [<ffffffff810715c9>] schedule_work+0x29/0x50
> [    0.530000]  [<ffffffff81083927>] clocksource_watchdog+0x107/0x240
> [    0.530000]  [<ffffffff81065eee>] run_timer_softirq+0x21e/0x380
> [    0.530000]  [<ffffffff81065e29>] ? run_timer_softirq+0x159/0x380
> [    0.530000]  [<ffffffff81083820>] ? clocksource_watchdog+0x0/0x240
> [    0.530000]  [<ffffffff810600aa>] __do_softirq+0x10a/0x200
> [    0.530000]  [<ffffffff8100cb1c>] call_softirq+0x1c/0x90
> [    0.530000]  [<ffffffff8100e955>] do_softirq+0x95/0xc0
> [    0.530000]  [<ffffffff8105f4c5>] irq_exit+0x75/0x90
> [    0.530000]  [<ffffffff81029c20>] smp_apic_timer_interrupt+0x80/0xd0
> [    0.530000]  [<ffffffff8100c4f3>] apic_timer_interrupt+0x13/0x20
> [    0.530000]  <EOI> 
> [    0.530000]  [<ffffffff812ff997>] ? delay_tsc+0x47/0x80
> [    0.530000]  [<ffffffff812ffab4>] ? __const_udelay+0x64/0x80
> [    0.530000]  [<ffffffff818b3fff>] ? do_boot_cpu+0x498/0x6b3
> [    0.530000]  [<ffffffff818b4567>] ? do_fork_idle+0x0/0x56
> [    0.530000]  [<ffffffff810468d2>] ? complete+0x32/0x80
> [    0.530000]  [<ffffffff818b4375>] ? native_cpu_up+0x15b/0x208
> [    0.530000]  [<ffffffff818b69d8>] ? _cpu_up+0xf3/0x1a4
> [    0.530000]  [<ffffffff818b6b4d>] ? cpu_up+0xc4/0xd7
> [    0.530000]  [<ffffffff821d3f97>] ? smp_init+0x118/0x126
> [    0.530000]  [<ffffffff821d40c7>] ? kernel_init+0x82/0xec
> [    0.530000]  [<ffffffff8100ca1a>] ? child_rip+0xa/0x20

btw., the crash itself seems to happen because we got a timer IRQ on 
CPU#0, which tries to queue work to CPU#1 but CPU#1 is not fully 
initialized yet?
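
The hypothesis above can be illustrated with a toy userspace model. This is only a sketch of the general pattern, not the kernel's workqueue code and not the fix that was applied: every name in it (toy_work, toy_wq, toy_events_wq, toy_init_workqueues, toy_schedule_work) is hypothetical. It assumes the failure mode is "work gets queued before its target has been set up", which would be consistent with the fault address 0x20 in the oops being a field read through a NULL base pointer. The guarded version records the request and returns instead of dereferencing the missing queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_WQ_SLOTS 8

/* Hypothetical work item; 'deferred' records a request that came too early. */
struct toy_work {
    bool deferred;
};

/* Hypothetical work queue: a fixed array of pending items. */
struct toy_wq {
    struct toy_work *slots[TOY_WQ_SLOTS];
    size_t count;
};

struct toy_wq toy_wq_storage;

/* Stays NULL until toy_init_workqueues() runs (assumed to happen late in boot). */
struct toy_wq *toy_events_wq = NULL;

/* Models the point in boot where the queue becomes usable. */
void toy_init_workqueues(void)
{
    toy_wq_storage.count = 0;
    toy_events_wq = &toy_wq_storage;
}

/*
 * Queue 'work' if the queue exists. If it does not exist yet, mark the
 * work as deferred and return false rather than touching a NULL queue.
 * Also returns false when the queue is full.
 */
bool toy_schedule_work(struct toy_work *work)
{
    assert(work != NULL);

    if (toy_events_wq == NULL) {
        work->deferred = true;   /* remember the request for a later retry */
        return false;
    }
    if (toy_events_wq->count >= TOY_WQ_SLOTS)
        return false;

    toy_events_wq->slots[toy_events_wq->count++] = work;
    work->deferred = false;
    return true;
}
```

In this model a periodic caller, such as a timer callback, would simply call toy_schedule_work() again on a later tick while the deferred flag is still set.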

	Ingo
