Message-ID: <20150714000910.GA8160@wfg-t540p.sh.intel.com>
Date:	Tue, 14 Jul 2015 08:09:10 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Mel Gorman <mgorman@...e.de>
Cc:	fengguang.wu@...el.com, Andrew Morton <akpm@...ux-foundation.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	linux-kernel@...r.kernel.org, LKP <lkp@...org>
Subject: [mminit] [ INFO: possible recursive locking detected ]

Greetings,

The 0day kernel testing robot got the dmesg below, and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit 0e1cc95b4cc7293bb7b39175035e7f7e45c90977
Author:     Mel Gorman <mgorman@...e.de>
AuthorDate: Tue Jun 30 14:57:27 2015 -0700
Commit:     Linus Torvalds <torvalds@...ux-foundation.org>
CommitDate: Tue Jun 30 19:44:56 2015 -0700

    mm: meminit: finish initialisation of struct pages before basic setup
    
    Waiman Long reported that 24TB machines hit OOM during basic setup when
    struct page initialisation was deferred.  One approach is to initialise
    memory on demand but it interferes with page allocator paths.  This patch
    creates dedicated threads to initialise memory before basic setup.  It
    then blocks on a rw_semaphore until completion as a wait_queue and counter
    is overkill.  This may be slower to boot but it's simpler overall and
    also gets rid of a section mangling which existed so kswapd could do the
    initialisation.
    
    [akpm@...ux-foundation.org: include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast]
    Signed-off-by: Mel Gorman <mgorman@...e.de>
    Cc: Waiman Long <waiman.long@...com>
    Cc: Nathan Zimmer <nzimmer@....com>
    Cc: Dave Hansen <dave.hansen@...el.com>
    Cc: Scott Norton <scott.norton@...com>
    Tested-by: Daniel J Blueman <daniel@...ascale.com>
    Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>

+-----------------------------------------------------+------------+------------+-----------------+
|                                                     | 74033a798f | 0e1cc95b4c | v4.2-rc1_071220 |
+-----------------------------------------------------+------------+------------+-----------------+
| boot_successes                                      | 0          | 0          | 0               |
| boot_failures                                       | 132        | 35         | 13              |
| kernel_BUG_at_include/linux/mtd/map.h               | 132        | 35         | 13              |
| invalid_opcode                                      | 132        | 35         | 13              |
| RIP:mtd_do_chip_probe                               | 132        | 35         | 13              |
| Kernel_panic-not_syncing:Fatal_exception            | 132        | 35         | 13              |
| backtrace:do_map_probe                              | 132        | 35         | 13              |
| backtrace:init_sbc_gxx                              | 132        | 35         | 13              |
| backtrace:kernel_init_freeable                      | 132        | 35         | 13              |
| INFO:possible_recursive_locking_detected            | 0          | 16         | 13              |
| backtrace:page_alloc_init_late                      | 0          | 16         | 13              |
| backtrace:down_write                                | 0          | 16         | 13              |
| WARNING:at_kernel/locking/lockdep.c:#lock_release() | 0          | 19         |                 |
| backtrace:up_read                                   | 0          | 19         |                 |
| backtrace:deferred_init_memmap                      | 0          | 19         |                 |
+-----------------------------------------------------+------------+------------+-----------------+

The dmesg of the parent commit (74033a798f) is attached too; its boot failure
looks like an independent bug.

[    0.084000] ..... host bus clock speed is 1000.0062 MHz.
[    0.084323] 
[    0.084537] =============================================
[    0.085229] [ INFO: possible recursive locking detected ]
[    0.085913] 4.1.0-11369-g0e1cc95b4 #5 Not tainted
[    0.086524] ---------------------------------------------
[    0.087224] swapper/1 is trying to acquire lock:
[    0.087839]  (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe9c>] page_alloc_init_late+0x7f/0x90
[    0.088000] 
[    0.088000] but task is already holding lock:
[    0.088000]  (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe30>] page_alloc_init_late+0x13/0x90
[    0.088000] 
[    0.088000] other info that might help us debug this:
[    0.088000]  Possible unsafe locking scenario:
[    0.088000] 
[    0.088000]        CPU0
[    0.088000]        ----
[    0.088000]   lock(pgdat_init_rwsem);
[    0.088000]   lock(pgdat_init_rwsem);
[    0.088000] 
[    0.088000]  *** DEADLOCK ***
[    0.088000] 
[    0.088000]  May be due to missing lock nesting notation
[    0.088000] 
[    0.088000] 1 lock held by swapper/1:
[    0.088000]  #0:  (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe30>] page_alloc_init_late+0x13/0x90
[    0.088000] 
[    0.088000] stack backtrace:
[    0.088000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.1.0-11369-g0e1cc95b4 #5
[    0.088000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[    0.088000]  ffffffff83591d60 ffff880010ee7d78 ffffffff81a4cb82 ffff880010ee7e48
[    0.088000]  ffffffff810fb38c ffff880010ee7db8 00000000810e53f7 ffff880010ef0c70
[    0.088000]  0000000000000000 ffffffff83591d60 ffffffff834c0c00 00000000004b425a
[    0.088000] Call Trace:
[    0.088000]  [<ffffffff81a4cb82>] dump_stack+0x19/0x1b
[    0.088000]  [<ffffffff810fb38c>] __lock_acquire+0xe3b/0xfeb
[    0.088000]  [<ffffffff815910be>] ? check_preemption_disabled+0x3c/0x196
[    0.088000]  [<ffffffff810fbac2>] lock_acquire+0x10e/0x198
[    0.088000]  [<ffffffff82cebe9c>] ? page_alloc_init_late+0x7f/0x90
[    0.088000]  [<ffffffff81a587ab>] down_write+0x3d/0x8b
[    0.088000]  [<ffffffff82cebe9c>] ? page_alloc_init_late+0x7f/0x90
[    0.088000]  [<ffffffff82cebe9c>] page_alloc_init_late+0x7f/0x90
[    0.088000]  [<ffffffff82cc476d>] kernel_init_freeable+0x180/0x2c9
[    0.088000]  [<ffffffff81a39fce>] ? rest_init+0x155/0x155
[    0.088000]  [<ffffffff81a39fd7>] kernel_init+0x9/0x152
[    0.088000]  [<ffffffff81a5b0cf>] ret_from_fork+0x3f/0x70
[    0.088000]  [<ffffffff81a39fce>] ? rest_init+0x155/0x155
[    0.088611] devtmpfs: initialized
[    0.098991] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.100542] xor: measuring software checksum speed
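
In the report above, the task already holds pgdat_init_rwsem (taken at
page_alloc_init_late+0x13) when it calls down_write() on the same lock at
page_alloc_init_late+0x7f. To pull the same report out of one of the attached
dmesg files, something like the following sketch may help. The function name
and the rule for where the report ends (the first line after "Call Trace:"
that has no [<address>] frame) are assumptions made for illustration; this is
not part of the 0day tooling.

```shell
#!/bin/bash
# Hypothetical helper (not part of the 0day tooling): print the lockdep
# report from a saved dmesg or serial log given as $1.
extract_lockdep_report() {
	awk '
		# Start printing at the report header.
		/INFO: possible recursive locking detected/ { in_report = 1 }
		!in_report { next }
		# Inside the call trace, the first line without a [<address>]
		# frame is taken as the end of the report (an assumption).
		in_trace && $0 !~ /\[<[0-9a-f]+>\]/ { exit }
		/Call Trace:/ { in_trace = 1 }
		{ print }
	' "$1"
}
```

For example, "extract_lockdep_report dmesg.txt" (dmesg.txt being a
hypothetical file name) prints the lines from the INFO header through the last
stack frame.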

git bisect start d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754 v4.1 --
git bisect good e382608254e06c8109f40044f5e693f2e04f3899  # 22:59     22+     22  Merge tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
git bisect  bad 5f1201d515819e7cfaaac3f0a30ff7b556261386  # 23:27      1-     20  Merge tag 'clk-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
git bisect good 88793e5c774ec69351ef6b5200bb59f532e41bca  # 23:36     22+     22  Merge tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
git bisect good 7adf12b87f45a77d364464018fb8e9e1ac875152  # 23:41     22+     22  Merge tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect good 8fff77551a9215a725650263e30fa105acca95ab  # 23:45     20+     20  Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect  bad 2d01eedf1d14432f4db5388a49dc5596a8c5bd02  # 23:50      1-      2  Merge branch 'akpm' (patches from Andrew)
git bisect good d5fb82137b6cd39e67c4321f4f5ce9b03d4d04e6  # 00:16     33+     35  Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 6ac15baacb6ecd87c66209627753b96ded3b4515  # 00:21     33+     33  Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect  bad 9ce71148b027e2bd27016139cae1c39401587695  # 00:25      1-      5  devpts: if initialization failed, don't crash when opening /dev/ptmx
git bisect  bad 460b865e53c347ebf110e50d499718cd9b39d810  # 00:30      5-     33  fs: document seq_open()'s usage of file->private_data
git bisect good 7e18adb4f80bea90d30b62158694d97c31f71d37  # 00:35     33+     33  mm: meminit: initialise remaining struct pages in parallel with kswapd
git bisect good ac5d2539b2382689b1cdb90bd60dcd49f61c2773  # 00:41     31+     31  mm: meminit: reduce number of times pageblocks are set during struct page init
git bisect  bad 0e1cc95b4cc7293bb7b39175035e7f7e45c90977  # 00:45     11-     24  mm: meminit: finish initialisation of struct pages before basic setup
git bisect good 74033a798f5a5db368126ee6f690111cf019bf7a  # 00:48     32+     32  mm: meminit: remove mminit_verify_page_links
# first bad commit: [0e1cc95b4cc7293bb7b39175035e7f7e45c90977] mm: meminit: finish initialisation of struct pages before basic setup
git bisect good 74033a798f5a5db368126ee6f690111cf019bf7a  # 00:51    100+    132  mm: meminit: remove mminit_verify_page_links
# extra tests with DEBUG_INFO
git bisect  bad 0e1cc95b4cc7293bb7b39175035e7f7e45c90977  # 00:54      0-     45  mm: meminit: finish initialisation of struct pages before basic setup
# extra tests on HEAD of linux-devel/devel-hourly-2015071220
git bisect  bad 1ae922e305feca3d8af890cf4601ef6a6cb5bbf1  # 00:54      0-     13  0day head guard for 'devel-hourly-2015071220'
# extra tests on tree/branch linus/master
git bisect  bad bc0195aad0daa2ad5b0d76cce22b167bc3435590  # 00:58      4-     45  Linux 4.2-rc2
# extra tests with first bad commit reverted
git bisect good 44813dd2ca45b1917d85ba59197678fdf069ce76  # 01:06     99+     99  Revert "mm: meminit: finish initialisation of struct pages before basic setup"
# extra tests on tree/branch linus/master
git bisect  bad bc0195aad0daa2ad5b0d76cce22b167bc3435590  # 01:06      0-     99  Linux 4.2-rc2
# extra tests on tree/branch next/master
git bisect  bad 2eb62d762a2112579f259903e62ba18d16c51f66  # 01:17      3-     23  Add linux-next specific files for 20150713
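
If the bisect log above is saved to a file, the first bad commit and a tally of
the good/bad verdicts can be read back with a short sketch like this one. The
function names and the log file name are hypothetical; the patterns only rely
on the line layout shown in the log.

```shell
#!/bin/bash
# Hypothetical helper: print the hash from the "# first bad commit:" line
# of a saved bisect log given as $1.
first_bad_commit() {
	sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1"
}

# Hypothetical helper: count the "git bisect good" and "git bisect bad"
# lines in the log given as $1 (repeated revisions are counted each time).
bisect_tally() {
	awk '
		$1 == "git" && $2 == "bisect" && ($3 == "good" || $3 == "bad") { n[$3]++ }
		END { printf "good=%d bad=%d\n", n["good"], n["bad"] }
	' "$1"
}
```

With the log saved as bisect.log (an assumed name), running
git show --stat "$(first_bad_commit bisect.log)" in a kernel tree would then
display the offending commit.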


This script may reproduce the error.

----------------------------------------------------------------------------
#!/bin/bash

kernel=$1
initrd=quantal-core-x86_64.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd

kvm=(
	qemu-system-x86_64
	-enable-kvm
	-cpu kvm64
	-kernel "$kernel"
	-initrd "$initrd"
	-m 300
	-smp 2
	-device e1000,netdev=net0
	-netdev user,id=net0
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null 
)

append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	systemd.log_level=err
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
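
Since the script sends the guest's serial console to stdout (-serial stdio),
its output can be captured and checked for the two lockdep signatures listed
in the results table (16 and 19 of the 35 failing boots on 0e1cc95b4c). A
minimal sketch follows; the function name, the file names, and the exact
wording matched for the WARNING line are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: classify a saved boot log ($1) by the two lockdep
# signatures from the results table. The WARNING pattern is an assumption
# about the exact dmesg wording.
classify_boot_log() {
	if grep -q 'INFO: possible recursive locking detected' "$1"; then
		echo "recursive-locking"
	elif grep -q 'WARNING:.*kernel/locking/lockdep\.c' "$1"; then
		echo "lock-release-warning"
	else
		echo "none"
	fi
}

# Usage sketch (reproduce.sh and boot.log are assumed names):
#   ./reproduce.sh /path/to/bzImage | tee boot.log
#   classify_boot_log boot.log
```

Because only some boots showed each signature, a single run reporting "none"
does not necessarily mean the kernel under test is good.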

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/lkp                          Intel Corporation

View attachment "dmesg-quantal-ivb41-143:20150714004438:x86_64-randconfig-a0-07122022:4.1.0-11369-g0e1cc95b4:5" of type "text/plain" (42216 bytes)

View attachment "dmesg-quantal-intel12-10:20150714005040:x86_64-randconfig-a0-07122022:4.1.0-11368-g74033a7:2" of type "text/plain" (39720 bytes)

View attachment "config-4.1.0-11369-g0e1cc95b4" of type "text/plain" (73196 bytes)
