Message-ID: <20150930223637.GA2839@wfg-t540p.sh.intel.com>
Date: Thu, 1 Oct 2015 06:36:37 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Chris Metcalf <cmetcalf@...hip.com>
Cc: 0day robot <fengguang.wu@...el.com>, LKP <lkp@...org>,
LKML <linux-kernel@...r.kernel.org>
Subject: [arch/x86] INFO: task swapper:1 blocked for more than 120 seconds.
Hi Chris,
We found boot errors when testing your patch on top of 4.3-rc3:
commit 2657eee793e8b13334860e7953d5aa6e49227521
Author: Chris Metcalf <cmetcalf@...hip.com>
AuthorDate: Mon Sep 28 11:17:22 2015 -0400
Commit: 0day robot <fengguang.wu@...el.com>
CommitDate: Mon Sep 28 23:23:55 2015 +0800
arch/x86: enable task isolation functionality
In prepare_exit_to_usermode(), we would like to call
task_isolation_enter() on every return to userspace, and like
other work items, we would like to recheck for more work after
calling it, since it will enable interrupts internally.
However, if task_isolation_enter() is the only work item,
and it has already been called once, we don't want to continue
calling it in a loop. We don't have a dedicated TIF flag for
task isolation, and it wouldn't make sense to have one, since
we'd want to set it before starting exit every time, and then
clear it the first time around the loop.
Instead, we change the loop structure somewhat, so that we
have a more inclusive set of flags that are tested for on the
first entry to the function (including TIF_NOHZ), and if any
of those flags are set, we enter the loop. And, we do the
task_isolation() test unconditionally at the bottom of the loop,
but then when making the decision to loop back, we just use the
set of flags that doesn't include TIF_NOHZ. That way we only
loop if there is other work to do, but then if that work
is done, we again unconditionally call task_isolation_enter().
In syscall_trace_enter_phase1(), we try to add the necessary
support for strict-mode detection of syscalls in an optimized
way, by letting the code remain unchanged if we are not using
TASK_ISOLATION, but otherwise calling enter_from_user_mode()
the first time we see _TIF_NOHZ, and then waiting until
after we do the secure computing work to actually clear the bit
from the "work" variable and call task_isolation_syscall().
Signed-off-by: Chris Metcalf <cmetcalf@...hip.com>
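For readers following the commit message above, the two-mask loop it describes can be modeled in a few lines of bash. This is only a userspace simulation, not the kernel code: the flag values, the `late` argument (work that shows up while task_isolation_enter() runs with interrupts enabled), and the printed action names are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical flag bits for this simulation only; not the kernel's values.
TIF_SIGPENDING=1
TIF_NEED_RESCHED=2
TIF_NOHZ=4

# Narrow mask: decides whether to go around the loop again.
LOOP_FLAGS=$(( TIF_SIGPENDING | TIF_NEED_RESCHED ))
# Inclusive mask: decides whether to enter the loop at all.
ENTRY_FLAGS=$(( LOOP_FLAGS | TIF_NOHZ ))

# $1: initial flag word.
# $2: (optional) flags that become pending during the first
#     task_isolation_enter call.
# Prints one line per simulated action.
simulate_exit_to_usermode() {
    local flags=$1 late=${2:-0}

    if (( (flags & ENTRY_FLAGS) == 0 )); then
        echo "fast path"
        return 0
    fi

    while true; do
        if (( flags & TIF_NEED_RESCHED )); then
            echo "schedule"
            flags=$(( flags & ~TIF_NEED_RESCHED ))
        fi
        if (( flags & TIF_SIGPENDING )); then
            echo "do_signal"
            flags=$(( flags & ~TIF_SIGPENDING ))
        fi

        # Called unconditionally at the bottom of every pass.
        echo "task_isolation_enter"
        flags=$(( flags | late ))
        late=0

        # Loop back only for work other than TIF_NOHZ.
        (( flags & LOOP_FLAGS )) || break
    done
}
```

With only the TIF_NOHZ bit set (`simulate_exit_to_usermode 4`), the model calls task_isolation_enter once and leaves. If a signal arrives during that call (`simulate_exit_to_usermode 4 1`), it handles the signal and then calls task_isolation_enter a second time before leaving.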
+--------------------------------------------------+------------+------------+------------+
| | 5f7bb45a98 | 2657eee793 | 0e63f8ed08 |
+--------------------------------------------------+------------+------------+------------+
| boot_successes | 103 | 0 | 0 |
| boot_failures | 2 | 15 | 19 |
| IP-Config:Auto-configuration_of_network_failed | 2 | | |
| INFO:task_blocked_for_more_than#seconds | 0 | 15 | 19 |
| BUG:kernel_boot_hang | 0 | 4 | 2 |
| backtrace:platform_device_add | 0 | 15 | 19 |
| backtrace:uvesafb_init | 0 | 15 | 19 |
| backtrace:kernel_init_freeable | 0 | 15 | 19 |
| EIP_is_at_default_send_IPI_mask_logical | 0 | 11 | 17 |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 11 | 17 |
| backtrace:watchdog | 0 | 11 | 17 |
+--------------------------------------------------+------------+------------+------------+
[ 122.507643] Writes: Total: 2 Max/Min: 0/0 Fail: 0
[ 182.516459] Writes: Total: 2 Max/Min: 0/0 Fail: 0
[ 241.690928] INFO: task swapper:1 blocked for more than 120 seconds.
[ 241.693206] Not tainted 4.3.0-rc3-00007-g2657eee #1
[ 241.697586] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
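When going through a saved console log for reports like the one above, a small filter can list each distinct hung-task message once. This is a minimal sketch: the function name is made up here, and the pattern assumes the task name contains no spaces.

```shell
#!/bin/bash
# Read a console log on stdin and print each distinct hung-task report once.
# Assumes the "INFO: task NAME:PID blocked for more than N seconds" wording
# shown in the excerpt above.
hung_task_reports() {
    grep -o 'INFO: task [^ ]* blocked for more than [0-9]* seconds' | sort -u
}
```

For example, `hung_task_reports < dmesg.txt` (file name hypothetical) prints nothing when the log contains no hung-task report.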
git bisect start 0e63f8ed0801de6528279fe49b7943f3a937816a 9ffecb10283508260936b96022d4ee43a7798b4c --
git bisect good 23a6580b5d9ab92b3bc6bdd7f0d6814c3e03d54b # 07:31 21+ 0 Merge 'linux-review/Tomasz-Figa/iommu-Add-range-flush-operation' into devel-hourly-2015093003
git bisect good 6d0672627097bed3c8ddee205685448d7206e6f9 # 07:37 22+ 0 Merge 'linux-review/Sebastian-Ott/misc-genwqe-get-rid-of-atomic-allocations' into devel-hourly-2015093003
git bisect good 9bc43d66a6c060de7edd252c415102b19326f91b # 07:43 22+ 0 Merge 'linux-review/Laurentiu-Tudor/KVM-PPC-e6500-allow-odd-powers-of-2K-TLB1-sizes' into devel-hourly-2015093003
git bisect good 5bec516af70bd1ae8c13cc4b02887f1b2d8c951b # 07:50 22+ 0 Merge 'linux-review/Javier-Martinez-Canillas/mfd-Simplify-return-logic' into devel-hourly-2015093003
git bisect bad 8b92195248da5a886956159c71b58dc66c562d87 # 08:01 0- 23 Merge 'linux-review/BryanSPaul/staging-rtl8188eu-style-fix-comparisons-moved-to-right' into devel-hourly-2015093003
git bisect bad 6cd8866c82bf94c11e603636b8479c828ecb3424 # 08:01 0- 13 Merge 'linux-review/Chris-Metcalf/support-task_isolated-mode-for-nohz_full' into devel-hourly-2015093003
git bisect good 3456bf34dfcb948e8ae9d1d58ea42e186e8ebb73 # 08:03 22+ 2 Merge 'linux-review/Javier-Martinez-Canillas/ARM-exynos_defconfig-Enable-WiFi-Ex-as-a-module-instead-built-in' into devel-hourly-2015093003
git bisect good 5f7bb45a98bfdc14fb24f44c945e01d5e5b15c90 # 10:44 22+ 2 nohz: task_isolation: allow tick to be fully disabled
git bisect bad 1be296341e2b8d647e112593b6bc5315b2016d74 # 11:32 0- 19 arch/arm64: enable task isolation functionality
git bisect bad 77245ab1e4afae5fc2d49c586ccd2aa64976977c # 11:32 0- 4 arch/arm64: adopt prepare_exit_to_usermode() model from x86
git bisect bad 2657eee793e8b13334860e7953d5aa6e49227521 # 11:32 0- 15 arch/x86: enable task isolation functionality
# first bad commit: [2657eee793e8b13334860e7953d5aa6e49227521] arch/x86: enable task isolation functionality
git bisect good 5f7bb45a98bfdc14fb24f44c945e01d5e5b15c90 # 11:51 66+ 2 nohz: task_isolation: allow tick to be fully disabled
# extra tests with DEBUG_INFO
git bisect bad 2657eee793e8b13334860e7953d5aa6e49227521 # 12:06 0- 16 arch/x86: enable task isolation functionality
# extra tests on HEAD of linux-devel/devel-hourly-2015093003
git bisect bad a19ae5a56b8f2562ca1f96374e8786d9a345937c # 12:06 0- 35 0day head guard for 'devel-hourly-2015093003'
# extra tests on tree/branch linux-devel/revert-a580b73412da93a2194037e54342980f2452520d-2657eee793e8b13334860e7953d5aa6e49227521
git bisect good aff461182c86ee8ccc57bd8843bf7ebd0ba6bf94 # 12:33 66+ 0 Revert "arch/x86: enable task isolation functionality"
# extra tests on tree/branch linus/master
git bisect good 3225031fbeb1e32b269a82eccd815128267a4bfe # 12:56 66+ 0 Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
# extra tests on tree/branch linux-next/master
git bisect good 0293645856ec527639b5902f021fa5aeba93e305 # 13:04 66+ 2 Add linux-next specific files for 20150929
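To pull the culprit out of a saved copy of the bisect log above, the "# first bad commit:" line can be matched directly. This is a minimal sketch; the function name is an assumption, and it relies only on the line format shown in the log.

```shell
#!/bin/bash
# Read a bisect log on stdin and print the full SHA-1 from the
# "# first bad commit: [<sha>] <subject>" line, if one is present.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p'
}
```

The result can be passed to `git show` in a tree that contains the commit, for example `git show "$(first_bad_commit < bisect.log)"` (log file name hypothetical).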
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
initrd=yocto-minimal-i386.cgz
wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu Haswell,+smep,+smap
-kernel "$kernel"
-initrd "$initrd"
-m 256
-smp 1
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
View attachment "dmesg-yocto-ivb41-53:20150930082054:i386-randconfig-x0-09300421:4.3.0-rc3-00007-g2657eee:1" of type "text/plain" (87641 bytes)
View attachment ".config" of type "text/plain" (91587 bytes)