Message-ID: <51CC708B.7040605@redhat.com>
Date:	Thu, 27 Jun 2013 13:04:11 -0400
From:	Prarit Bhargava <prarit@...hat.com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	Linux Kernel <linux-kernel@...r.kernel.org>, athorlton@....com,
	CAI Qian <caiqian@...hat.com>
Subject: Re: BUG: tick device NULL pointer during system initialization and
 shutdown



On 06/26/2013 07:05 AM, Thomas Gleixner wrote:
> On Tue, 25 Jun 2013, Prarit Bhargava wrote:
>> On 06/24/2013 09:57 AM, Thomas Gleixner wrote:
>>> Does the patch below fix it?
>>>
>>
>> Thomas,
>>
>> Thanks for the patch.
>>
>> The reproducibility appears to be quite low.  I'm seeing this roughly 1 time
>> every six hours of continuous system reboots.  I'm testing right now with your
>> patch.  I'll update the thread in a couple of days...
> 
> I have a proper version of that patch now along with an explanation of
> the failure.
> 
> -------------------->
> 
> Subject: tick: Make oneshot broadcast robust vs. CPU offlining
> From: Thomas Gleixner <tglx@...utronix.de>
> Date: Wed, 26 Jun 2013 12:17:32 +0200
> 
> In periodic mode we remove offline cpus from the broadcast propagation
> mask. In oneshot mode we fail to do so. This was not a problem so far,
> but the recent changes to the broadcast propagation introduced a
> constellation which can result in a NULL pointer dereference.
> 

Unfortunately, this patch causes the NMI watchdog to fire during system shutdown.
Most of the CPUs are in start_secondary+0x254/0x256.

CPU 0, however, is here:

[  270.579581] NMI backtrace for cpu 0
[  270.583480] CPU: 0 PID: 595 Comm: kworker/0:2 Not tainted 3.10.0-rc4+ #2
[  270.590954] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.T030.072620111404 07/26/2011
[  270.601345] task: ffff880851c50000 ti: ffff880851c72000 task.ti: ffff880851c72000
[  270.609691] RIP: 0010:[<ffffffff8109a8c0>]  [<ffffffff8109a8c0>] update_cfs_shares+0xf0/0xf0
[  270.619126] RSP: 0018:ffff880851c73d78  EFLAGS: 00000086
[  270.625049] RAX: ffffffff81626180 RBX: ffff880851c50048 RCX: 0000000000000000
[  270.633007] RDX: 0000000000000001 RSI: ffff880851c50048 RDI: ffff88085f414670
[  270.640965] RBP: ffff880851c73dc0 R08: 0000003effcc9cfd R09: 0000000000000000
[  270.648923] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88085f414670
[  270.656881] R13: ffff88085f414600 R14: 0000000000000001 R15: 0000000000000001
[  270.664841] FS:  0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
[  270.673865] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  270.680272] CR2: 00000000000000b8 CR3: 00000000018f8000 CR4: 00000000000007f0
[  270.688229] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  270.696188] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  270.704146] Stack:
[  270.706388]  ffffffff8109b019 ffff88085f414600 ffff88085f414600 0000000000000000
[  270.714684]  ffff88085f414600 ffff88085f414600 0000000000000000 ffff880851c50000
[  270.722981]  ffff8808521ec700 ffff880851c73de8 ffffffff8108ed39 0000000168d36c00
[  270.731276] Call Trace:
[  270.734007]  [<ffffffff8109b019>] ? dequeue_task_fair+0x59/0x640
[  270.740713]  [<ffffffff8108ed39>] dequeue_task+0x79/0xa0
[  270.746638]  [<ffffffff81091be3>] deactivate_task+0x23/0x30
[  270.752857]  [<ffffffff816023f9>] __schedule+0x589/0x7d0
[  270.758782]  [<ffffffff81602669>] schedule+0x29/0x70
[  270.764323]  [<ffffffff8107de03>] worker_thread+0x1c3/0x3a0
[  270.770541]  [<ffffffff8107dc40>] ? rescuer_thread+0x350/0x350
[  270.777041]  [<ffffffff81084300>] kthread+0xc0/0xd0
[  270.782474]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40
[  270.789272]  [<ffffffff8160c56c>] ret_from_fork+0x7c/0xb0
[  270.795295]  [<ffffffff81084240>] ? insert_kthread_work+0x40/0x40

and CPU 63 is the one generating the backtrace:

[  272.655049] CPU: 63 PID: 0 Comm: swapper/63 Not tainted 3.10.0-rc4+ #2
[  272.662331] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.T030.072620111404 07/26/2011
[  272.672714] task: ffff880854df4de0 ti: ffff880854e02000 task.ti: ffff880854e02000
[  272.681062] RIP: 0010:[<ffffffff812f3c82>]  [<ffffffff812f3c82>] delay_tsc+0x32/0x80
[  272.689720] RSP: 0018:ffff88106f3c3dd0  EFLAGS: 00000083
[  272.695647] RAX: 000000000000009e RBX: 00000000cea08f3d RCX: 0000000000000001
[  272.703607] RDX: 00000000cea08fdb RSI: 0000000000000050 RDI: 00000000001e7000
[  272.711569] RBP: ffff88106f3c3de8 R08: ffffffff81a02928 R09: 000000000000070e
[  272.719529] R10: 0000000000000000 R11: ffff88106f3c3b46 R12: 00000000001e7000
[  272.727491] R13: 000000000000003f R14: ffff88106f3cec80 R15: ffffffff81949480
[  272.735452] FS:  0000000000000000(0000) GS:ffff88106f3c0000(0000) knlGS:0000000000000000
[  272.744470] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  272.750879] CR2: 00007f114a8f7920 CR3: 0000000c61b5f000 CR4: 00000000000007e0
[  272.758841] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  272.766801] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  272.774759] Stack:
[  272.777001]  0000000000002710 ffffffff81949300 ffffffff81949000 ffff88106f3c3df8
[  272.785303]  ffffffff812f3be8 ffff88106f3c3e10 ffffffff81036faa ffffffff81a02ba0
[  272.793605]  ffff88106f3c3e70 ffffffff810f8060 0000000354df4de0 0000000000000242
[  272.801908] Call Trace:
[  272.804634]  <IRQ>
[  272.806782]  [<ffffffff812f3be8>] __const_udelay+0x28/0x30
[  272.813122]  [<ffffffff81036faa>] arch_trigger_all_cpu_backtrace+0x7a/0xa0
[  272.820799]  [<ffffffff810f8060>] rcu_check_callbacks+0x5b0/0x600
[  272.827603]  [<ffffffff81070217>] update_process_times+0x47/0x80
[  272.834313]  [<ffffffff810b94f5>] tick_sched_handle.isra.15+0x25/0x60
[  272.841500]  [<ffffffff810b9571>] tick_sched_timer+0x41/0x60
[  272.847821]  [<ffffffff81087c74>] __run_hrtimer+0x74/0x1d0
[  272.853943]  [<ffffffff810b9530>] ? tick_sched_handle.isra.15+0x60/0x60
[  272.861325]  [<ffffffff81088457>] hrtimer_interrupt+0xf7/0x240
[  272.867841]  [<ffffffff8160e429>] smp_apic_timer_interrupt+0x69/0x9c
[  272.874933]  [<ffffffff8160d29d>] apic_timer_interrupt+0x6d/0x80
[  272.881634]  <EOI>
[  272.883781]  [<ffffffff810b0432>] ? cpu_startup_entry+0x132/0x230
[  272.890803]  [<ffffffff810b0400>] ? cpu_startup_entry+0x100/0x230
[  272.897605]  [<ffffffff815ed4e8>] start_secondary+0x254/0x256
[  272.904014] Code: 89 e5 41 55 41 54 41 89 fc 53 65 44 8b 2c 25 1c b0 00 00 66 66 90 0f ae e8 e8 5b 46 d2 ff 66 90 89 c3 eb 14 0f 1f 44 00 00 f3 90 <65> 8b 04 25 1c b0 00 00 41 39 c5 75 1d 66 66 90 0f ae e8 e8 36

P.
