lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sun, 10 Nov 2013 18:16:12 +0800 From: Fengguang Wu <fengguang.wu@...el.com> To: Peter Zijlstra <peterz@...radead.org> Cc: Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org, Michael wang <wangyun@...ux.vnet.ibm.com> Subject: [sched/get_online_cpus] INFO: task swapper/0:1 blocked for more than 120 seconds. Greetings, I got the below dmesg and the first bad commit is commit 6acce3ef84520537f8a09a12c9ddbe814a584dd2 Author: Peter Zijlstra <peterz@...radead.org> Date: Fri Oct 11 14:38:20 2013 +0200 sched: Remove get_online_cpus() usage Remove get_online_cpus() usage from the scheduler; there's 4 sites that use it: - sched_init_smp(); where its completely superfluous since we're in 'early' boot and there simply cannot be any hotplugging. - sched_getaffinity(); we already take a raw spinlock to protect the task cpus_allowed mask, this disables preemption and therefore also stabilizes cpu_online_mask as that's modified using stop_machine. However switch to active mask for symmetry with sched_setaffinity()/set_cpus_allowed_ptr(). We guarantee active mask stability by inserting sync_rcu/sched() into _cpu_down. - sched_setaffinity(); we don't appear to need get_online_cpus() either, there's two sites where hotplug appears relevant: * cpuset_cpus_allowed(); for the !cpuset case we use possible_mask, for the cpuset case we hold task_lock, which is a spinlock and thus for mainline disables preemption (might cause pain on RT). * set_cpus_allowed_ptr(); Holds all scheduler locks and thus has preemption properly disabled; also it already deals with hotplug races explicitly where it releases them. - migrate_swap(); we can make stop_two_cpus() do the heavy lifting for us with a little trickery. By adding a sync_sched/rcu() after the CPU_DOWN_PREPARE notifier we can provide preempt/rcu guarantees for cpu_active_mask. Use these to validate that both our cpus are active when queueing the stop work before we queue the stop_machine works for take_cpu_down(). Signed-off-by: Peter Zijlstra <peterz@...radead.org> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com> Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com> Cc: Mel Gorman <mgorman@...e.de> Cc: Rik van Riel <riel@...hat.com> Cc: Srikar Dronamraju <srikar@...ux.vnet.ibm.com> Cc: Andrea Arcangeli <aarcange@...hat.com> Cc: Johannes Weiner <hannes@...xchg.org> Cc: Linus Torvalds <torvalds@...ux-foundation.org> Cc: Andrew Morton <akpm@...ux-foundation.org> Cc: Steven Rostedt <rostedt@...dmis.org> Cc: Oleg Nesterov <oleg@...hat.com> Link: http://lkml.kernel.org/r/20131011123820.GV3081@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@...nel.org> =================================================== PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT! =================================================== +------------------------------------------------------------------------------------+--------------+--------------+ | | 746023159c40 | 1918664d0b2b | +------------------------------------------------------------------------------------+--------------+--------------+ | good_boots | 594 | 7 | | has_kernel_error_warning | 6 | 12 | | BUG:kernel_early_hang_without_any_printk_output | 6 | | | INFO:task_blocked_for_more_than_seconds | 0 | 12 | | INFO:NMI_handler(arch_trigger_all_cpu_backtrace_handler)took_too_long_to_run:msecs | 0 | 12 | | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 12 | +------------------------------------------------------------------------------------+--------------+--------------+ [ 10.410377] EDD information not available. [ 10.412866] Unregister pv shared memory for cpu 0 [ 10.499606] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3 [ 241.206735] INFO: task swapper/0:1 blocked for more than 120 seconds. [ 241.209244] Not tainted 3.12.0-02098-g1918664 #98 [ 241.211290] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 241.214286] swapper/0 D ffff88000ec9e040 0 1 0 0x00000000 [ 241.217142] ffff88000eca1c10 0000000000000046 ffffffff826aca60 ffff88000ec9e040 [ 241.220200] ffff88000eca1fd8 0000000000012a00 0000000000012a00 ffff88000c9927c0 [ 241.223196] ffff88000ec9e040 ffff88000ec9e040 ffff88000eca1b60 0000000000000002 [ 241.226835] Call Trace: [ 241.227813] [<ffffffff8110e544>] ? __lock_acquire+0x1e74/0x1ef0 [ 241.230123] [<ffffffff8110e544>] ? __lock_acquire+0x1e74/0x1ef0 [ 241.232400] [<ffffffff81d39ea5>] schedule+0x55/0x60 [ 241.234345] [<ffffffff81d38efb>] schedule_timeout+0x2b/0x200 [ 241.236533] [<ffffffff8110a88e>] ? mark_held_locks+0x12e/0x150 [ 241.238845] [<ffffffff81d3e577>] ? _raw_spin_unlock_irq+0x27/0x60 [ 241.241356] [<ffffffff810f753d>] ? get_parent_ip+0xd/0x50 [ 241.243790] [<ffffffff810f7603>] ? preempt_count_sub+0x33/0x50 [ 241.246273] [<ffffffff81d3a9c7>] wait_for_common+0x167/0x1b0 [ 241.248715] [<ffffffff810f9470>] ? try_to_wake_up+0x330/0x330 [ 241.251291] [<ffffffff81124df0>] ? __call_rcu.constprop.42+0x3d0/0x3d0 [ 241.254145] [<ffffffff81d3ab48>] wait_for_completion+0x18/0x20 [ 241.256756] [<ffffffff8111ed16>] wait_rcu_gp+0x46/0x50 [ 241.259000] [<ffffffff8111eb30>] ? ftrace_raw_output_rcu_barrier+0x70/0x70 [ 241.262012] [<ffffffff811263dc>] synchronize_rcu+0x2c/0x30 [ 241.264437] [<ffffffff81d1bd82>] _cpu_down+0x2b2/0x340 [ 241.266760] [<ffffffff81d3afea>] ? mutex_lock_nested+0x3da/0x400 [ 241.269311] [<ffffffff8106c832>] ? cpu_hotplug_driver_lock+0x12/0x20 [ 241.272113] [<ffffffff81d1bec1>] cpu_down+0x31/0x50 [ 241.274282] [<ffffffff81d1aaa7>] _debug_hotplug_cpu+0x47/0xb0 [ 241.276853] [<ffffffff824251f3>] ? pci_iommu_alloc+0x6c/0x6c [ 241.279183] [<ffffffff82425200>] debug_hotplug_cpu+0xd/0x11 [ 241.281679] [<ffffffff8241ce54>] do_one_initcall+0x7e/0x11b [ 241.284140] [<ffffffff810e9a1d>] ? parse_args+0x1dd/0x300 [ 241.286665] [<ffffffff8241cfed>] kernel_init_freeable+0xfc/0x18d [ 241.289336] [<ffffffff8241c7ec>] ? do_early_param+0x8a/0x8a [ 241.291894] [<ffffffff81d1a8e0>] ? rest_init+0xc0/0xc0 [ 241.294233] [<ffffffff81d1a8e9>] kernel_init+0x9/0x180 [ 241.296494] [<ffffffff81d3f9cc>] ret_from_fork+0x7c/0xb0 [ 241.298917] [<ffffffff81d1a8e0>] ? rest_init+0xc0/0xc0 [ 241.303393] 3 locks held by swapper/0/1: [ 241.305121] #0: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff8106c832>] cpu_hotplug_driver_lock+0x12/0x20 [ 241.309967] #1: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81d1bead>] cpu_down+0x1d/0x50 [ 241.314045] #2: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810cca86>] cpu_hotplug_begin+0x26/0x60 [ 241.318328] sending NMI to all CPUs: [ 241.319960] NMI backtrace for cpu 1 [ 241.321503] CPU: 1 PID: 29 Comm: khungtaskd Not tainted 3.12.0-02098-g1918664 #98 [ 241.321618] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 241.321618] task: ffff88000cf8e780 ti: ffff88000cf90000 task.ti: ffff88000cf90000 [ 241.321618] RIP: 0010:[<ffffffff8107377d>] [<ffffffff8107377d>] flat_send_IPI_mask+0x8d/0xd0 [ 241.321618] RSP: 0000:ffff88000cf91db8 EFLAGS: 00010046 [ 241.321618] RAX: 0000000003000000 RBX: 0000000000000002 RCX: 0000000000000007 [ 241.321618] RDX: 0000000000000c00 RSI: 0000000000000001 RDI: 0000000000000300 [ 241.321618] RBP: ffff88000cf91dd8 R08: 0000000000000000 R09: 0000000000000000 [ 241.321618] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000246 [ 241.321618] R13: 0000000000000c00 R14: 0000000000000003 R15: 00000000000003ff [ 241.321618] FS: 0000000000000000(0000) GS:ffff88000f100000(0000) knlGS:0000000000000000 [ 241.321618] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 241.321618] CR2: 00000000ffffffff CR3: 0000000002206000 CR4: 00000000000006a0 [ 241.321618] Stack: [ 241.321618] 0000000000002710 ffff88000ec9e040 0000000000007fff ffff88000ec9e040 [ 241.321618] ffff88000cf91de8 ffffffff810737da ffff88000cf91e00 ffffffff81070a3f [ 241.321618] 0000000000000078 ffff88000cf91e68 ffffffff8113e1a8 ffffffff8113df4b [ 241.321618] Call Trace: [ 241.321618] [<ffffffff810737da>] flat_send_IPI_all+0x1a/0x70 [ 241.321618] [<ffffffff81070a3f>] arch_trigger_all_cpu_backtrace+0x5f/0xb0 [ 241.321618] [<ffffffff8113e1a8>] watchdog+0x2d8/0x3d0 [ 241.321618] [<ffffffff8113df4b>] ? watchdog+0x7b/0x3d0 [ 241.321618] [<ffffffff8113ded0>] ? hung_task_panic+0x20/0x20 [ 241.321618] [<ffffffff810eb431>] kthread+0xd1/0xe0 [ 241.321618] [<ffffffff81d3e577>] ? _raw_spin_unlock_irq+0x27/0x60 [ 241.321618] [<ffffffff810eb360>] ? __kthread_unpark+0x50/0x50 [ 241.321618] [<ffffffff81d3f9cc>] ret_from_fork+0x7c/0xb0 [ 241.321618] [<ffffffff810eb360>] ? __kthread_unpark+0x50/0x50 [ 241.321618] Code: e6 10 75 f2 44 89 f0 c1 e0 18 89 04 25 10 b3 57 ff 89 da 44 09 ea 41 81 cd 00 04 00 00 83 fb 02 41 0f 44 d5 89 14 25 00 b3 57 ff <41> f7 c4 00 02 00 00 75 1a 4c 89 e7 57 9d 0f 1f 44 00 00 e8 8b git bisect start 1918664d0b2b6b2ad6b180e7c04a42872beb6691 d8524ae9d6f492a9c6db9f4d89c5f9b8782fa2d5 -- git bisect good 4d9218daae2b00379b2bb81eaf4fe5143f372db1 # 11:54 27+ 0 Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core git bisect good f05ef3f6f78e6b97f5664a8290f1ac0717b1a1b0 # 12:09 27+ 2 kvm tools: Switch to using an enum for disk image types git bisect good 3784592e0c0f224e399605cfa2d1570f79c0cc29 # 12:17 27+ 6 config: fix make kvmconfig git bisect bad 9c028f0df729ba4958fe7b142d8521b66ee154d9 # 12:31 6- 4 Merge branch 'sched/core' git bisect good 04bb2f9475054298f0c67a89ca92cade42d3fe5e # 12:45 50+ 0 sched/numa: Adjust scan rate in task_numa_placement git bisect good ecf1f014325ba60f4df35edae1a357c67c5d4eb1 # 12:56 50+ 3 Merge branch 'core/rcu' into core/locking, to prepare the kernel/locking/ file move git bisect bad 01768b42dc97a67b4fb33a2535c49fc1969880df # 13:08 21- 18 locking: Move the mutex code to kernel/locking/ git bisect bad c2d816443ef305aba8eaf0bf368f4d3d87494f06 # 13:22 4- 7 sched/wait: Introduce prepare_to_wait_event() git bisect good ec0ad3d01f99d5e5b56a99a58f7003b99250dc65 # 14:11 50+ 9 Merge branch 'core/urgent' into sched/core git bisect good 7c3f2ab7b844f1a859afbc3d41925e8a0faba5fa # 14:50 50+ 4 sched/rt: Add missing rmb() git bisect bad 6acce3ef84520537f8a09a12c9ddbe814a584dd2 # 14:55 13- 3 sched: Remove get_online_cpus() usage git bisect good 746023159c40c523b08a3bc3d213dac212385895 # 15:33 150+ 6 sched: Fix race in migrate_swap_stop() git bisect good 746023159c40c523b08a3bc3d213dac212385895 # 16:04 450+ 6 sched: Fix race in migrate_swap_stop() git bisect bad 1918664d0b2b6b2ad6b180e7c04a42872beb6691 # 16:04 0- 12 Merge branch 'x86/intel-mid' git bisect good 1ae1820bfe3f0964cf4bc3c5628055330e3cc538 # 16:47 450+ 0 Revert "sched: Remove get_online_cpus() usage" git bisect good 6c86ae2928f9e4cbf0d5844f5fcfd549e3450b8c # 17:26 450+ 63 Merge tag 'ftrace-urgent-3.12-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace git bisect bad bad570d77024cba808126d68a2072f6f8ce64c27 # 17:37 32- 16 Add linux-next specific files for 20131108 git bisect bad 1918664d0b2b6b2ad6b180e7c04a42872beb6691 # 17:37 0- 12 Merge branch 'x86/intel-mid' Thanks, Fengguang View attachment "dmesg-yocto-jaketown-20:20131108223032:x86_64-randconfig-x4-1108:3.12.0-02098-g1918664:98" of type "text/plain" (57763 bytes) Download attachment "bisect-1918664d0b2b6b2ad6b180e7c04a42872beb6691-x86_64-randconfig-x4-1108-Kernel-panic---not-syncing:-hung_task:-blocked-tasks-112033.log" of type "application/octet-stream" (39959 bytes) View attachment "config-3.12.0-02098-g1918664" of type "text/plain" (74647 bytes) _______________________________________________ LKP mailing list LKP@...ux.intel.com
Powered by blists - more mailing lists