Message-ID: <5aeb303f.ZSZnAoV6lBIOMHs0%lkp@intel.com>
Date: Thu, 03 May 2018 23:52:31 +0800
From: kernel test robot <lkp@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKP <lkp@...org>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...nel.org>, wfg@...ux.intel.com
Subject: 85f1abe001 ("kthread, sched/wait: Fix kthread_parkme() .."):
WARNING: CPU: 0 PID: 1 at kernel/kthread.c:486 kthread_park
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/urgent
commit 85f1abe0019fcb3ea10df7029056cf42702283a8
Author: Peter Zijlstra <peterz@...radead.org>
AuthorDate: Tue May 1 18:14:45 2018 +0200
Commit: Ingo Molnar <mingo@...nel.org>
CommitDate: Thu May 3 07:38:05 2018 +0200
kthread, sched/wait: Fix kthread_parkme() completion issue
Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().
When that happens, sched_cpu_dying() -> migrate_tasks() can end up
migrating such a still runnable task onto another CPU.
Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.
However, when we lose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:
- observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
the park and set TASK_RUNNING, or
- __kthread_bind()'s wait_task_inactive() to observe the competing
TASK_RUNNING store.
Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.
Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.
The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migrating the still runnable thing.
Reported-by: Gaurav Kohli <gkohli@...eaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Ingo Molnar <mingo@...nel.org>
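
The ordering argument in the commit message above can be illustrated with a toy Python model. This is not kernel code: `ToyKthread`, `park_old_order`, `park_new_order`, and `running_when_signalled` are hypothetical names invented for this sketch, and the only thing modelled is whether the "parked" completion is signalled before or after the task has left the CPU.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKthread:
    """Toy stand-in for a parking kthread (assumption: not the kernel's struct kthread)."""
    on_cpu: bool = True                      # True until the task has scheduled out
    log: list = field(default_factory=list)  # ordered (event, on_cpu) records

    def signal_parked(self):
        # Models complete(): record whether the task was still on a CPU at that moment.
        self.log.append(("complete", self.on_cpu))

    def schedule_out(self):
        # Models the task going to sleep in schedule().
        self.on_cpu = False
        self.log.append(("schedule_out", self.on_cpu))


def park_old_order(task):
    """Old placement: the task signals completion itself, then sleeps."""
    task.signal_parked()
    task.schedule_out()


def park_new_order(task):
    """Fixed placement: completion is signalled only after the task has scheduled out."""
    task.schedule_out()
    task.signal_parked()


def running_when_signalled(task):
    """True if a waiter woken by the completion could still see a runnable task."""
    return dict(task.log)["complete"]


old, new = ToyKthread(), ToyKthread()
park_old_order(old)
park_new_order(new)
print("old order, still running at complete():", running_when_signalled(old))
print("new order, still running at complete():", running_when_signalled(new))
```

In the old ordering the waiter can be released while the task is still runnable, which is the window the commit message describes; in the new ordering that window does not exist in this model. The sketch says nothing about the WARNING reported below, which is a separate regression.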
741a76b350 kthread, sched/wait: Fix kthread_parkme() wait-loop
85f1abe001 kthread, sched/wait: Fix kthread_parkme() completion issue
12bc056bd1 Merge branch 'sched/urgent'
+-------------------------------------------+------------+------------+------------+
|                                           | 741a76b350 | 85f1abe001 | 12bc056bd1 |
+-------------------------------------------+------------+------------+------------+
| boot_successes                            | 44         | 0          | 0          |
| boot_failures                             | 0          | 26         | 22         |
| WARNING:at_kernel/kthread.c:#kthread_park | 0          | 26         | 22         |
| EIP:kthread_park                          | 0          | 26         | 22         |
+-------------------------------------------+------------+------------+------------+
[ 0.011005] CPU: GenuineIntel Intel Core Processor (Haswell) (family: 0x6, model: 0x3c, stepping: 0x4)
[ 0.012011] Spectre V2 : Spectre mitigation: kernel not compiled with retpoline; no mitigation available!
[ 0.013949] Performance Events: no PMU driver, software events only.
[ 0.019490] NMI watchdog: Perf event create on CPU 0 failed with -2
[ 0.020005] NMI watchdog: Perf NMI watchdog permanently disabled
[ 0.020692] WARNING: CPU: 0 PID: 1 at kernel/kthread.c:486 kthread_park+0x2b/0x56
[ 0.021000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.17.0-rc3-00038-g85f1abe #687
[ 0.021000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.021000] EIP: kthread_park+0x2b/0x56
[ 0.021000] EFLAGS: 00210202 CPU: 0
[ 0.021000] EAX: cf464d40 EBX: cf422500 ECX: 00000000 EDX: 00000004
[ 0.021000] ESI: c221e2c0 EDI: 00000000 EBP: cf42df68 ESP: cf42df60
[ 0.021000] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 0.021000] CR0: 80050033 CR2: ffffffff CR3: 022e8000 CR4: 00140690
[ 0.021000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.021000] DR6: fffe0ff0 DR7: 00000400
[ 0.021000] Call Trace:
[ 0.021000] smpboot_update_cpumask_percpu_thread+0x28/0x42
[ 0.021000] softlockup_update_smpboot_threads+0x37/0x39
[ 0.021000] lockup_detector_reconfigure+0x17/0x62
[ 0.021000] lockup_detector_init+0x5d/0x69
[ 0.021000] kernel_init_freeable+0x52/0x15c
[ 0.021000] ? rest_init+0xf4/0xf4
[ 0.021000] kernel_init+0x8/0xd0
[ 0.021000] ret_from_fork+0x19/0x30
[ 0.021000] Code: 55 89 e5 56 53 8b 50 14 0f ba e2 15 72 02 0f 0b 8b 98 44 03 00 00 80 e2 04 74 09 0f 0b be da ff ff ff eb 2c 8b 13 80 e2 04 74 09 <0f> 0b be f0 ff ff ff eb 1c 80 0b 04 8b 15 70 de f6 c1 31 f6 39
[ 0.021000] ---[ end trace 3a71adb42feecba7 ]---
[ 0.021060] TSC deadline timer enabled
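
For readers who want to post-process the call trace above, here is a minimal Python sketch that splits each frame into symbol, offset, and function size. It assumes the usual `symbol+0xOFFSET/0xSIZE` layout of kernel backtraces, where a leading `?` marks a frame the unwinder considers unreliable. The `parse_frame` helper and its regular expression are illustrative, not part of any kernel tooling.

```python
import re

# One call-trace frame: "[ timestamp] [? ]symbol+0xoffset/0xsize"
FRAME = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)$"
)


def parse_frame(line):
    """Return a dict for a call-trace frame line, or None if the line is not a frame."""
    m = FRAME.match(line.strip())
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),   # bytes into the function
        "size": int(m.group("size"), 16),       # total function size in bytes
        "reliable": m.group("unreliable") is None,
    }


trace = [
    "[ 0.021000] Call Trace:",
    "[ 0.021000] smpboot_update_cpumask_percpu_thread+0x28/0x42",
    "[ 0.021000] lockup_detector_init+0x5d/0x69",
    "[ 0.021000] ? rest_init+0xf4/0xf4",
]
frames = [f for f in map(parse_frame, trace) if f is not None]
for f in frames:
    flag = "" if f["reliable"] else " (unreliable)"
    print(f"{f['symbol']} at byte {f['offset']} of {f['size']}{flag}")
```

Lines that are not frames, such as the `Call Trace:` header or the register dump, are skipped by returning `None`.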
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start d17cc3a1a1091797eff4a671659c15f2dc667996 6d08b06e67cd117f6992c46611dfb4ce267cd71e --
git bisect good 99309e94100238f4e0d1d6bdc31f4034587d5fb9 # 19:13 G 11 0 0 1 Merge 'linux-review/Bj-rn-Mork/qmi_wwan-do-not-steal-interfaces-from-class-drivers/20180503-164137' into devel-catchup-201805031730
git bisect bad 067517401c4ab83974fb63d7c4a98eb82791d15f # 19:22 B 0 3 27 11 Merge 'tip/sched/urgent' into devel-catchup-201805031730
git bisect good 7ff5000268355c63dc948ecb01f4de17987586e5 # 19:35 G 11 0 0 1 Merge tag 'sound-4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 0d95cfa922c24bcc20b5ccf7496b6ac7c8e29efb # 20:44 G 11 0 0 0 Merge tag 'powerpc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 65f4d6d0f80b3c55830ec5735194703fa2909ba1 # 20:58 G 11 0 0 1 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 2d618bdf71635463a4aa4ad0fe46ec852292bc0c # 21:00 G 10 0 0 0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel
git bisect good ecd649b3408408841d5793038b0241e55ac7a141 # 21:09 G 11 0 0 0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good dcf234577cd31fa16874e828b90659166ad6b80d # 21:20 G 11 0 0 0 tracing: Add field modifier parsing hist error for hist triggers
git bisect good 0b26351b910fb8fe6a056f8a1bbccabe50c0e19f # 21:35 G 11 0 0 1 stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
git bisect good 741a76b350897604c48fb12beff1c9b77724dc96 # 21:51 G 11 0 0 2 kthread, sched/wait: Fix kthread_parkme() wait-loop
git bisect bad 85f1abe0019fcb3ea10df7029056cf42702283a8 # 22:03 B 0 11 35 11 kthread, sched/wait: Fix kthread_parkme() completion issue
# first bad commit: [85f1abe0019fcb3ea10df7029056cf42702283a8] kthread, sched/wait: Fix kthread_parkme() completion issue
git bisect good 741a76b350897604c48fb12beff1c9b77724dc96 # 22:06 G 31 0 0 2 kthread, sched/wait: Fix kthread_parkme() wait-loop
# extra tests with debug options
git bisect bad 85f1abe0019fcb3ea10df7029056cf42702283a8 # 22:20 B 0 3 16 0 kthread, sched/wait: Fix kthread_parkme() completion issue
# extra tests on HEAD of linux-devel/devel-catchup-201805031730
git bisect bad d17cc3a1a1091797eff4a671659c15f2dc667996 # 22:25 B 0 13 29 0 0day head guard for 'devel-catchup-201805031730'
# extra tests on tree/branch tip/sched/urgent
git bisect bad 85f1abe0019fcb3ea10df7029056cf42702283a8 # 22:33 B 0 26 39 0 kthread, sched/wait: Fix kthread_parkme() completion issue
# extra tests with first bad commit reverted
git bisect good 8058ae7fe7b61a67f8a7da7013e4c3d7fa5c0ba8 # 23:39 G 11 0 0 1 Revert "kthread, sched/wait: Fix kthread_parkme() completion issue"
# extra tests on tree/branch tip/master
git bisect bad 12bc056bd1170303f849954e8e566488acfc7202 # 23:51 B 0 11 35 11 Merge branch 'sched/urgent'
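
The annotated bisect log above can be turned into structured records with a short script. The sketch below assumes each result line follows the column header given in the log (`HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD`) and takes the column names from that header without interpreting them further. `parse_bisect_line` is a hypothetical helper, not part of git or the 0-day tooling.

```python
import re

# "git bisect <verdict> <sha> # HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD subject"
LINE = re.compile(
    r"^git bisect (?P<verdict>good|bad) (?P<sha>[0-9a-f]{40})\s+#\s+"
    r"(?P<time>\d\d:\d\d)\s+(?P<result>[GB])\s+"
    r"(?P<good>\d+)\s+(?P<bad>\d+)\s+"
    r"(?P<good_but_dirty>\d+)\s+(?P<dirty_not_bad>\d+)\s+"
    r"(?P<subject>.*)$"
)


def parse_bisect_line(line):
    """Parse one annotated 'git bisect good|bad' line; return None for anything else."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    row = m.groupdict()
    for key in ("good", "bad", "good_but_dirty", "dirty_not_bad"):
        row[key] = int(row[key])
    return row


sample = [
    "git bisect good 741a76b350897604c48fb12beff1c9b77724dc96 # 21:51 G 11 0 0 2 "
    "kthread, sched/wait: Fix kthread_parkme() wait-loop",
    "git bisect bad 85f1abe0019fcb3ea10df7029056cf42702283a8 # 22:03 B 0 11 35 11 "
    "kthread, sched/wait: Fix kthread_parkme() completion issue",
    "# first bad commit: [85f1abe0019fcb3ea10df7029056cf42702283a8] "
    "kthread, sched/wait: Fix kthread_parkme() completion issue",
]
rows = [r for r in map(parse_bisect_line, sample) if r is not None]
for r in rows:
    print(r["sha"][:10], r["verdict"], "good:", r["good"], "bad:", r["bad"])
```

Comment lines such as `# first bad commit: ...` and the `git bisect start` line do not match the pattern and are skipped.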
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
Download attachment "dmesg-openwrt-lkp-hsw01-103:20180503220225:i386-randconfig-i1-201817:4.17.0-rc3-00038-g85f1abe:687.gz" of type "application/gzip" (17082 bytes)
Download attachment "dmesg-yocto-ivb41-33:20180503215015:i386-randconfig-i1-201817:4.17.0-rc3-00037-g741a76b:686.gz" of type "application/gzip" (31567 bytes)
View attachment "reproduce-openwrt-lkp-hsw01-103:20180503220225:i386-randconfig-i1-201817:4.17.0-rc3-00038-g85f1abe:687" of type "text/plain" (918 bytes)
View attachment "config-4.17.0-rc3-00038-g85f1abe" of type "text/plain" (116024 bytes)