Message-ID: <20150714000910.GA8160@wfg-t540p.sh.intel.com>
Date: Tue, 14 Jul 2015 08:09:10 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Mel Gorman <mgorman@...e.de>
Cc: fengguang.wu@...el.com, Andrew Morton <akpm@...ux-foundation.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-kernel@...r.kernel.org, LKP <lkp@...org>
Subject: [mminit] [ INFO: possible recursive locking detected ]
Greetings,
The 0day kernel testing robot got the dmesg below; the first bad commit is
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit 0e1cc95b4cc7293bb7b39175035e7f7e45c90977
Author: Mel Gorman <mgorman@...e.de>
AuthorDate: Tue Jun 30 14:57:27 2015 -0700
Commit: Linus Torvalds <torvalds@...ux-foundation.org>
CommitDate: Tue Jun 30 19:44:56 2015 -0700
mm: meminit: finish initialisation of struct pages before basic setup
Waiman Long reported that 24TB machines hit OOM during basic setup when
struct page initialisation was deferred. One approach is to initialise
memory on demand but it interferes with page allocator paths. This patch
creates dedicated threads to initialise memory before basic setup. It
then blocks on a rw_semaphore until completion as a wait_queue and counter
is overkill. This may be slower to boot but it's simpler overall and
also gets rid of a section mangling which existed so kswapd could do the
initialisation.
[akpm@...ux-foundation.org: include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast]
Signed-off-by: Mel Gorman <mgorman@...e.de>
Cc: Waiman Long <waiman.long@...com>
Cc: Nathan Zimmer <nzimmer@....com>
Cc: Dave Hansen <dave.hansen@...el.com>
Cc: Scott Norton <scott.norton@...com>
Tested-by: Daniel J Blueman <daniel@...ascale.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
+-----------------------------------------------------+------------+------------+-----------------+
|                                                     | 74033a798f | 0e1cc95b4c | v4.2-rc1_071220 |
+-----------------------------------------------------+------------+------------+-----------------+
| boot_successes                                      | 0          | 0          | 0               |
| boot_failures                                       | 132        | 35         | 13              |
| kernel_BUG_at_include/linux/mtd/map.h               | 132        | 35         | 13              |
| invalid_opcode                                      | 132        | 35         | 13              |
| RIP:mtd_do_chip_probe                               | 132        | 35         | 13              |
| Kernel_panic-not_syncing:Fatal_exception            | 132        | 35         | 13              |
| backtrace:do_map_probe                              | 132        | 35         | 13              |
| backtrace:init_sbc_gxx                              | 132        | 35         | 13              |
| backtrace:kernel_init_freeable                      | 132        | 35         | 13              |
| INFO:possible_recursive_locking_detected            | 0          | 16         | 13              |
| backtrace:page_alloc_init_late                      | 0          | 16         | 13              |
| backtrace:down_write                                | 0          | 16         | 13              |
| WARNING:at_kernel/locking/lockdep.c:#lock_release() | 0          | 19         |                 |
| backtrace:up_read                                   | 0          | 19         |                 |
| backtrace:deferred_init_memmap                      | 0          | 19         |                 |
+-----------------------------------------------------+------------+------------+-----------------+
The parent commit's dmesg is attached too; its failure looks like an independent bug.
[ 0.084000] ..... host bus clock speed is 1000.0062 MHz.
[ 0.084323]
[ 0.084537] =============================================
[ 0.085229] [ INFO: possible recursive locking detected ]
[ 0.085913] 4.1.0-11369-g0e1cc95b4 #5 Not tainted
[ 0.086524] ---------------------------------------------
[ 0.087224] swapper/1 is trying to acquire lock:
[ 0.087839] (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe9c>] page_alloc_init_late+0x7f/0x90
[ 0.088000]
[ 0.088000] but task is already holding lock:
[ 0.088000] (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe30>] page_alloc_init_late+0x13/0x90
[ 0.088000]
[ 0.088000] other info that might help us debug this:
[ 0.088000] Possible unsafe locking scenario:
[ 0.088000]
[ 0.088000] CPU0
[ 0.088000] ----
[ 0.088000] lock(pgdat_init_rwsem);
[ 0.088000] lock(pgdat_init_rwsem);
[ 0.088000]
[ 0.088000] *** DEADLOCK ***
[ 0.088000]
[ 0.088000] May be due to missing lock nesting notation
[ 0.088000]
[ 0.088000] 1 lock held by swapper/1:
[ 0.088000] #0: (pgdat_init_rwsem){++++.+}, at: [<ffffffff82cebe30>] page_alloc_init_late+0x13/0x90
[ 0.088000]
[ 0.088000] stack backtrace:
[ 0.088000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.1.0-11369-g0e1cc95b4 #5
[ 0.088000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 0.088000] ffffffff83591d60 ffff880010ee7d78 ffffffff81a4cb82 ffff880010ee7e48
[ 0.088000] ffffffff810fb38c ffff880010ee7db8 00000000810e53f7 ffff880010ef0c70
[ 0.088000] 0000000000000000 ffffffff83591d60 ffffffff834c0c00 00000000004b425a
[ 0.088000] Call Trace:
[ 0.088000] [<ffffffff81a4cb82>] dump_stack+0x19/0x1b
[ 0.088000] [<ffffffff810fb38c>] __lock_acquire+0xe3b/0xfeb
[ 0.088000] [<ffffffff815910be>] ? check_preemption_disabled+0x3c/0x196
[ 0.088000] [<ffffffff810fbac2>] lock_acquire+0x10e/0x198
[ 0.088000] [<ffffffff82cebe9c>] ? page_alloc_init_late+0x7f/0x90
[ 0.088000] [<ffffffff81a587ab>] down_write+0x3d/0x8b
[ 0.088000] [<ffffffff82cebe9c>] ? page_alloc_init_late+0x7f/0x90
[ 0.088000] [<ffffffff82cebe9c>] page_alloc_init_late+0x7f/0x90
[ 0.088000] [<ffffffff82cc476d>] kernel_init_freeable+0x180/0x2c9
[ 0.088000] [<ffffffff81a39fce>] ? rest_init+0x155/0x155
[ 0.088000] [<ffffffff81a39fd7>] kernel_init+0x9/0x152
[ 0.088000] [<ffffffff81a5b0cf>] ret_from_fork+0x3f/0x70
[ 0.088000] [<ffffffff81a39fce>] ? rest_init+0x155/0x155
[ 0.088611] devtmpfs: initialized
[ 0.098991] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.100542] xor: measuring software checksum speed
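When reading the call trace above, frames prefixed with "?" are addresses that were found on the stack but that the unwinder could not confirm as part of the call chain, so the unmarked frames are the ones to follow. A minimal sketch for pulling those out of a saved dmesg; the function name reliable_frames is only an illustration and is not part of the 0day tooling:

```shell
# reliable_frames: read a dmesg excerpt on stdin and print the symbol+offset
# of every call-trace frame that is NOT marked with "?".
# Only lines that follow a "Call Trace:" header are considered.
reliable_frames() {
    awk '
        /Call Trace:/ { in_trace = 1; next }
        in_trace && /\[<[0-9a-f]+>\] / {
            # The last field is symbol+offset/size; a "?" just before it
            # marks an unverified frame, which is skipped.
            if ($(NF-1) != "?") print $NF
            next
        }
        in_trace { in_trace = 0 }   # first non-frame line ends the trace
    '
}
```

Applied to the report above, this would list the chain from dump_stack down through down_write, page_alloc_init_late and kernel_init_freeable to ret_from_fork.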
git bisect start d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754 v4.1 --
git bisect good e382608254e06c8109f40044f5e693f2e04f3899 # 22:59 22+ 22 Merge tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
git bisect bad 5f1201d515819e7cfaaac3f0a30ff7b556261386 # 23:27 1- 20 Merge tag 'clk-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
git bisect good 88793e5c774ec69351ef6b5200bb59f532e41bca # 23:36 22+ 22 Merge tag 'libnvdimm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
git bisect good 7adf12b87f45a77d364464018fb8e9e1ac875152 # 23:41 22+ 22 Merge tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect good 8fff77551a9215a725650263e30fa105acca95ab # 23:45 20+ 20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect bad 2d01eedf1d14432f4db5388a49dc5596a8c5bd02 # 23:50 1- 2 Merge branch 'akpm' (patches from Andrew)
git bisect good d5fb82137b6cd39e67c4321f4f5ce9b03d4d04e6 # 00:16 33+ 35 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 6ac15baacb6ecd87c66209627753b96ded3b4515 # 00:21 33+ 33 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 9ce71148b027e2bd27016139cae1c39401587695 # 00:25 1- 5 devpts: if initialization failed, don't crash when opening /dev/ptmx
git bisect bad 460b865e53c347ebf110e50d499718cd9b39d810 # 00:30 5- 33 fs: document seq_open()'s usage of file->private_data
git bisect good 7e18adb4f80bea90d30b62158694d97c31f71d37 # 00:35 33+ 33 mm: meminit: initialise remaining struct pages in parallel with kswapd
git bisect good ac5d2539b2382689b1cdb90bd60dcd49f61c2773 # 00:41 31+ 31 mm: meminit: reduce number of times pageblocks are set during struct page init
git bisect bad 0e1cc95b4cc7293bb7b39175035e7f7e45c90977 # 00:45 11- 24 mm: meminit: finish initialisation of struct pages before basic setup
git bisect good 74033a798f5a5db368126ee6f690111cf019bf7a # 00:48 32+ 32 mm: meminit: remove mminit_verify_page_links
# first bad commit: [0e1cc95b4cc7293bb7b39175035e7f7e45c90977] mm: meminit: finish initialisation of struct pages before basic setup
git bisect good 74033a798f5a5db368126ee6f690111cf019bf7a # 00:51 100+ 132 mm: meminit: remove mminit_verify_page_links
# extra tests with DEBUG_INFO
git bisect bad 0e1cc95b4cc7293bb7b39175035e7f7e45c90977 # 00:54 0- 45 mm: meminit: finish initialisation of struct pages before basic setup
# extra tests on HEAD of linux-devel/devel-hourly-2015071220
git bisect bad 1ae922e305feca3d8af890cf4601ef6a6cb5bbf1 # 00:54 0- 13 0day head guard for 'devel-hourly-2015071220'
# extra tests on tree/branch linus/master
git bisect bad bc0195aad0daa2ad5b0d76cce22b167bc3435590 # 00:58 4- 45 Linux 4.2-rc2
# extra tests with first bad commit reverted
git bisect good 44813dd2ca45b1917d85ba59197678fdf069ce76 # 01:06 99+ 99 Revert "mm: meminit: finish initialisation of struct pages before basic setup"
# extra tests on tree/branch linus/master
git bisect bad bc0195aad0daa2ad5b0d76cce22b167bc3435590 # 01:06 0- 99 Linux 4.2-rc2
# extra tests on tree/branch next/master
git bisect bad 2eb62d762a2112579f259903e62ba18d16c51f66 # 01:17 3- 23 Add linux-next specific files for 20150713
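The bisect log above carries extra annotations after each commit id (time of day, pass/fail counts, subject). A minimal sketch for extracting the useful parts, assuming the log has been saved to a file; the function names and the file name bisect.log are hypothetical:

```shell
# first_bad_commit: print the full commit id from the
# "# first bad commit: [<sha>] <subject>" line of a bisect log on stdin.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)\].*/\1/p'
}

# clean_bisect_log: print only the "git bisect start|good|bad" commands
# that precede the "# first bad commit" marker, dropping the trailing
# "# time counts subject" annotations and the later extra tests.
clean_bisect_log() {
    awk '
        /^# first bad commit:/    { exit }
        /^git bisect start /      { print; next }
        /^git bisect (good|bad) / { print $1, $2, $3, $4 }
    '
}
```

For example, `first_bad_commit < bisect.log` should print the id of the commit named in the subject of this report. The cleaned output is closer to the format that `git bisect log` itself writes, so it may be usable with `git bisect replay`, though that has not been verified here.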
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=${1:?usage: $0 <path-to-kernel-image>}
initrd=quantal-core-x86_64.cgz
wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu kvm64
-kernel "$kernel"
-initrd "$initrd"
-m 300
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
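Because the script passes -serial stdio and console=ttyS0, the guest console ends up on the script's standard output, so a run can be checked for the lockdep report. A minimal sketch, assuming the script above is saved as reproduce.sh and the output is captured in console.log (both names are hypothetical, as is the function name):

```shell
# has_recursive_locking_splat: succeed (exit 0) if the console output on
# stdin contains the lockdep "possible recursive locking" banner.
has_recursive_locking_splat() {
    grep -q 'INFO: possible recursive locking detected'
}

# Usage (hypothetical file names):
#   ./reproduce.sh arch/x86/boot/bzImage | tee console.log
#   has_recursive_locking_splat < console.log && echo "reproduced"
```

The table above shows the report in 16 of 35 failing boots of the first bad commit, so several runs may be needed before it appears.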
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
View attachment "dmesg-quantal-ivb41-143:20150714004438:x86_64-randconfig-a0-07122022:4.1.0-11369-g0e1cc95b4:5" of type "text/plain" (42216 bytes)
View attachment "dmesg-quantal-intel12-10:20150714005040:x86_64-randconfig-a0-07122022:4.1.0-11368-g74033a7:2" of type "text/plain" (39720 bytes)
View attachment "config-4.1.0-11369-g0e1cc95b4" of type "text/plain" (73196 bytes)