linux-kernel - Re: [lkp] [fs] df4c0e36f1: NMI watchdog: BUG: soft lockup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPAsAGwVYSgXm0Rgbtim-5iFxhWaakc1emXQALKwYLm36OoVsw@mail.gmail.com>
Date:	Mon, 2 Nov 2015 12:32:55 +0300
From:	Andrey Ryabinin <ryabinin.a.a@...il.com>
To:	kernel test robot <ying.huang@...ux.intel.com>
Cc:	LKP <lkp@...org>, LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Pekka Enberg <penberg@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, Andi Kleen <andi@...stfloor.org>,
	Dave Hansen <dave.hansen@...el.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Christoph Lameter <cl@...ux.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Konstantin Khlebnikov <koct9i@...il.com>,
	Yuri Gribov <tetra2005@...il.com>,
	Andrey Konovalov <adech.fo@...il.com>,
	Dmitry Chernenkov <dmitryc@...gle.com>,
	Konstantin Serebryany <kcc@...gle.com>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [lkp] [fs] df4c0e36f1: NMI watchdog: BUG: soft lockup - CPU#0
 stuck for 22s! [swapper/0:1]

2015-11-02 11:06 GMT+03:00 kernel test robot <ying.huang@...ux.intel.com>:
> FYI, we noticed the below changes on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit df4c0e36f1b1782b0611a77c52cc240e5c4752dd ("fs: dcache: manually unpoison dname after allocation to shut up kasan's reports")
>
>
> The commit fixed a KASan bug, but introduced or revealed a soft lockup
> issue as follow.
>

I think that this issue is caused by several factors.
 1. Non-preempt kernel + heavy debug options (CONFIG_PREEMPT=n
CONFIG_GCOV_PROFILE_ALL=y CONFIG_KASAN=y)
 2. I suspect that your host is overloaded, thus KVM guest runs too slow.

And the major factor here is number 2.

In your dmesg:
[   67.891156] rbtree testing -> 570841 cycles
[   88.609636] augmented rbtree testing
[  116.546697] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[swapper/0:1]


I've tried to reproduce this, and got this:
[    0.693574] rbtree testing -> 15513 cycles
570841/15513 = 36x times faster.

[    1.159450] augmented rbtree testing -> 23675 cycles
[    1.864996]
It took less than a second, meanwhile in your case it didn't finish in
22 seconds.

This makes me think that your host is overloaded and the problem is on
your side.

>
> [  116.546697] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
> [  116.546697] irq event stamp: 3018750
> [  116.546697] hardirqs last  enabled at (3018749): [<ffffffff8224f36c>] restore_args+0x0/0x30
> [  116.546697] hardirqs last disabled at (3018750): [<ffffffff8224f65d>] apic_timer_interrupt+0x6d/0x80
> [  116.546697] softirqs last  enabled at (3018748): [<ffffffff810de898>] __do_softirq+0x417/0x65b
> [  116.546697] softirqs last disabled at (3018743): [<ffffffff810ded30>] irq_exit+0x55/0xc3
> [  116.546697] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.19.0-05249-gdf4c0e3 #1
> [  116.546697] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
> [  116.546697] task: ffff88002f3f0000 ti: ffff88002f3f8000 task.ti: ffff88002f3f8000
> [  116.546697] RIP: 0010:[<ffffffff8125fdf9>]  [<ffffffff8125fdf9>] __asan_load4+0x12/0x91
> [  116.546697] RSP: 0000:ffff88002f3fbdc8  EFLAGS: 00000282
> [  116.546697] RAX: ffff7fffffffffff RBX: 0000000000000000 RCX: dffffc0000000000
> [  116.546697] RDX: fffffbfff0763087 RSI: ffffffff83b18b10 RDI: ffffffff83b1803c
> [  116.546697] RBP: ffff88002f3fbdf8 R08: 0000000000000001 R09: fffffbfff0763074
> [  116.546697] R10: ffffffff83b183a7 R11: fffffbfff0763074 R12: ffff880038d29003
> [  116.546697] R13: ffffed00071a5200 R14: ffffffff818cfadb R15: ffff88002f3fbdf8
> [  116.546697] FS:  0000000000000000(0000) GS:ffff88002f600000(0000) knlGS:0000000000000000
> [  116.546697] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  116.546697] CR2: 0000000000000000 CR3: 0000000002793000 CR4: 00000000000006f0
> [  116.546697] Stack:
> [  116.546697]  ffffffff83b18d5c ffffffff83b18b08 ffffffff83b18878 ffffffff83b18b10
> [  116.546697]  ffffffff83b18868 0000000000000000 ffff88002f3fbe48 ffffffff8223c4d3
> [  116.546697]  0000000000000000 c4400d6283b18d40 0000000000000202 0000000000000d98
> [  116.546697] Call Trace:
> [  116.546697]  [<ffffffff8223c4d3>] insert_augmented+0x52/0xf5
> [  116.546697]  [<ffffffff82b7b4ef>] rbtree_test_init+0x151/0x205
> [  116.546697]  [<ffffffff82b7b39e>] ? glob_init+0x11c/0x11c
> [  116.546697]  [<ffffffff82b222dc>] do_one_initcall+0x105/0x197
> [  116.546697]  [<ffffffff82b22501>] kernel_init_freeable+0x193/0x22e
> [  116.546697]  [<ffffffff8222ba4b>] ? rest_init+0x14d/0x14d
> [  116.546697]  [<ffffffff8222ba59>] kernel_init+0xe/0xdf
> [  116.546697]  [<ffffffff8224e5fc>] ret_from_fork+0x7c/0xb0
> [  116.546697]  [<ffffffff8222ba4b>] ? rest_init+0x14d/0x14d
> [  116.546697] Code: 38 00 74 13 48 8b 4d 08 ba 01 00 00 00 be 02 00 00 00 e8 e0 09 00 00 c9 c3 55 48 b8 ff ff ff ff ff 7f ff ff 48 89 e5 48 83 ec 30 <48> 39 c7 77 23 48 8b 45 08 48 89 7d d8 48 8d 7d d8 48 c7 45 e8
> [  116.546697] Kernel panic - not syncing: softlockup: hung tasks
> [  116.546697] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G             L  3.19.0-05249-gdf4c0e3 #1
> [  116.546697] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
> [  116.546697]  0000000000000000 0000000000000000 0000000000000000 ffff88002f603d48
> [  116.546697]  ffffffff8223bcf7 fffffbfff074360b ffffffff8263d4e2 ffff88002f603dc8
> [  116.546697]  ffffffff82234300 ffff88002f603dc8 ffffffff00000008 ffff88002f603dd8
> [  116.546697] Call Trace:
> [  116.546697]  <IRQ>  [<ffffffff8223bcf7>] dump_stack+0x4c/0x65
> [  116.546697]  [<ffffffff82234300>] panic+0xed/0x231
> [  116.546697]  [<ffffffff811adf9d>] watchdog_timer_fn+0x1d2/0x1f8
> [  116.546697]  [<ffffffff811addcb>] ? __touch_watchdog+0x1c/0x1c
> [  116.546697]  [<ffffffff8117083e>] __run_hrtimer+0x283/0x6a2
> [  116.546697]  [<ffffffff81171aaf>] hrtimer_run_queues+0x126/0x145
> [  116.546697]  [<ffffffff8116ff2f>] run_local_timers+0xe/0x1a
> [  116.546697]  [<ffffffff8116ff62>] update_process_times+0x27/0x54
> [  116.546697]  [<ffffffff81181e76>] tick_nohz_handler+0x101/0x164
> [  116.546697]  [<ffffffff81040dbc>] local_apic_timer_interrupt+0x78/0x80
> [  116.546697]  [<ffffffff8225144c>] smp_apic_timer_interrupt+0x57/0x69
> [  116.546697]  [<ffffffff8224f662>] apic_timer_interrupt+0x72/0x80
> [  116.546697]  <EOI>  [<ffffffff8125fdf9>] ? __asan_load4+0x12/0x91
> [  116.546697]  [<ffffffff8223c4d3>] insert_augmented+0x52/0xf5
> [  116.546697]  [<ffffffff82b7b4ef>] rbtree_test_init+0x151/0x205
> [  116.546697]  [<ffffffff82b7b39e>] ? glob_init+0x11c/0x11c
> [  116.546697]  [<ffffffff82b222dc>] do_one_initcall+0x105/0x197
> [  116.546697]  [<ffffffff82b22501>] kernel_init_freeable+0x193/0x22e
> [  116.546697]  [<ffffffff8222ba4b>] ? rest_init+0x14d/0x14d
> [  116.546697]  [<ffffffff8222ba59>] kernel_init+0xe/0xdf
> [  116.546697]  [<ffffffff8224e5fc>] ret_from_fork+0x7c/0xb0
> [  116.546697]  [<ffffffff8222ba4b>] ? rest_init+0x14d/0x14d
> [  116.546697] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
>
>
> Thanks,
> Ying Huang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/