Message-ID: <20141028142541.GA19097@wfg-t540p.sh.intel.com>
Date:	Tue, 28 Oct 2014 22:25:41 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	LKP <lkp@...org>, linux-kernel@...r.kernel.org
Subject: WARNING: CPU: 0 PID: 61 at kernel/sched/core.c:7312 __might_sleep()

Greetings,

The 0day kernel testing robot got the dmesg below, and the first bad commit is:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/wait
commit 245747099820df3007f60128b1264fef9d2a69d2
Author:     Peter Zijlstra <peterz@...radead.org>
AuthorDate: Wed Sep 24 10:18:55 2014 +0200
Commit:     Peter Zijlstra <peterz@...radead.org>
CommitDate: Mon Oct 27 10:42:51 2014 +0100

    sched: Debug nested sleeps
    
    Validate we call might_sleep() with TASK_RUNNING, which catches places
    where we nest blocking primitives, eg. mutex usage in a wait loop.
    
    Since all blocking is arranged through task_struct::state, nesting
    this will cause the inner primitive to set TASK_RUNNING and the outer
    will thus not block.
    
    Another observed problem is calling a blocking function from
    schedule()->sched_submit_work()->blk_schedule_flush_plug() which will
    then destroy the task state for the actual __schedule() call that
    comes after it.
    
    Cc: torvalds@...ux-foundation.org
    Cc: tglx@...utronix.de
    Cc: ilya.dryomov@...tank.com
    Cc: umgwanakikbuti@...il.com
    Cc: mingo@...nel.org
    Cc: oleg@...hat.com
    
    Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
    Link: http://lkml.kernel.org/r/20140924082242.591637616@infradead.org
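
For readers unfamiliar with the pattern the commit message describes, here is a toy userspace model of it in bash. It is only a sketch and not kernel code: the function names are borrowed from the kernel for readability, the variables task_state, state_set_by and warnings are made up for this illustration, and mutex_lock here always models the contended case in which the lock sleeps and returns in TASK_RUNNING.

```shell
#!/bin/bash
# Toy userspace model of the nested-sleep check; NOT kernel code.
# The numeric values mirror the kernel's TASK_RUNNING (0) and
# TASK_UNINTERRUPTIBLE (2); everything else is illustrative.
TASK_RUNNING=0
TASK_UNINTERRUPTIBLE=2

task_state=$TASK_RUNNING   # stands in for task_struct::state
state_set_by=""            # remembers who last changed the state
warnings=0

prepare_to_wait() {        # outer primitive: mark the task as about to block
	task_state=$TASK_UNINTERRUPTIBLE
	state_set_by="prepare_to_wait"
}

might_sleep() {            # the debug check: blocking ops expect TASK_RUNNING
	if [ "$task_state" -ne "$TASK_RUNNING" ]; then
		warnings=$((warnings + 1))
		echo "do not call blocking ops when !TASK_RUNNING; state=$task_state set at $state_set_by" >&2
	fi
}

mutex_lock() {             # inner primitive: may sleep, so it checks first
	might_sleep
	# Assumption: the lock was contended, so the task slept and woke up
	# in TASK_RUNNING, wiping out the state the outer wait loop had set.
	task_state=$TASK_RUNNING
}

# Nested case: a mutex taken between prepare_to_wait and the actual sleep.
prepare_to_wait
mutex_lock
echo "warnings=$warnings final_state=$task_state"
```

The final state is back to TASK_RUNNING, which is the second problem the commit message mentions: the outer wait would no longer block.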

===================================================
PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
===================================================
120 /kernel/i386-randconfig-r2-1027/592ed717ef33150f6888c333c28021283cc9aabc

To bisect errors in parent:
/c/kernel-tests/queue-reproduce /kernel/i386-randconfig-r2-1027/592ed717ef33150f6888c333c28021283cc9aabc/dmesg-quantal-kbuild-20:20141027231410:i386-randconfig-r2-1027:3.18.0-rc2-00036-g592ed71:139 BUG: kernel test crashed

The dmesg for the parent commit is attached as well, to help confirm whether this is a noise error.

+---------------------------------------------------+------------+------------+------------+
|                                                   | 592ed717ef | 2457470998 | 2d55520314 |
+---------------------------------------------------+------------+------------+------------+
| boot_successes                                    | 1080       | 267        | 110        |
| boot_failures                                     | 120        | 33         | 21         |
| BUG:kernel_test_crashed                           | 110        | 30         | 16         |
| WARNING:at_kernel/locking/lockdep.c:check_flags() | 10         | 0          | 3          |
| backtrace:might_fault                             | 2          |            |            |
| backtrace:SyS_perf_event_open                     | 3          | 0          | 1          |
| backtrace:mutex_lock_nested                       | 1          |            |            |
| WARNING:at_kernel/sched/core.c:__might_sleep()    | 0          | 3          | 2          |
| backtrace:cleanup_net                             | 0          | 3          | 2          |
| backtrace:register_perf_hw_breakpoint             | 0          | 0          | 1          |
| backtrace:hw_breakpoint_event_init                | 0          | 0          | 1          |
| backtrace:perf_init_event                         | 0          | 0          | 1          |
| backtrace:perf_event_alloc                        | 0          | 0          | 1          |
+---------------------------------------------------+------------+------------+------------+
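
To put the counts above in proportion, the overall boot failure rate per commit can be worked out as below. This is a rough sketch that assumes boot_successes + boot_failures is the total number of boots for each commit; the helper name failure_rate is made up for this illustration.

```shell
#!/bin/bash
# Percentage of failed boots, assuming total = successes + failures.
failure_rate() {  # usage: failure_rate <boot_successes> <boot_failures>
	awk -v s="$1" -v f="$2" 'BEGIN { printf "%.1f\n", 100 * f / (s + f) }'
}

failure_rate 1080 120   # 592ed717ef (parent commit)
failure_rate 267 33     # 2457470998 (first bad commit)
failure_rate 110 21     # 2d55520314 (merge of sched/wait)
```

Under that assumption the overall rates are in the same range for all three commits, so the row that separates the parent from the first bad commit is the __might_sleep() warning (0 versus 3 and 2), not the total failure count.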

[  122.133640] Fix your initscripts?
[  122.133905] trinity-c0 (23733) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.txt.
[  122.247299] ------------[ cut here ]------------
[  122.247328] WARNING: CPU: 0 PID: 61 at kernel/sched/core.c:7312 __might_sleep+0x50/0x249()
[  122.247334] do not call blocking ops when !TASK_RUNNING; state=2 set at [<c106ffd9>] prepare_to_wait+0x3c/0x5f
[  122.247339] Modules linked in:
[  122.247349] CPU: 0 PID: 61 Comm: kworker/u2:1 Not tainted 3.18.0-rc2-00037-g24574709 #136
[  122.247350] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  122.247368] Workqueue: netns cleanup_net
[  122.247377]  c1071d83 d2b83dd8 d2b83dac c15887b1 d2b83dc8 c104c4c6 00001c90 c1068ebf
[  122.247383]  00000000 c17b67e3 0000026d d2b83de0 c104c508 00000009 d2b83dd8 c17b5d4b
[  122.247388]  d2b83df4 d2b83e0c c1068ebf c17b5cec 00001c90 c17b5d4b 00000002 c106ffd9
[  122.247389] Call Trace:
[  122.247393]  [<c1071d83>] ? down_trylock+0x23/0x2c
[  122.247402]  [<c15887b1>] dump_stack+0x16/0x18
[  122.247413]  [<c104c4c6>] warn_slowpath_common+0x66/0x7d
[  122.247416]  [<c1068ebf>] ? __might_sleep+0x50/0x249
[  122.247419]  [<c104c508>] warn_slowpath_fmt+0x2b/0x2f
[  122.247422]  [<c1068ebf>] __might_sleep+0x50/0x249
[  122.247424]  [<c106ffd9>] ? prepare_to_wait+0x3c/0x5f
[  122.247426]  [<c106ffd9>] ? prepare_to_wait+0x3c/0x5f
[  122.247432]  [<c158c364>] mutex_lock_nested+0x23/0x347
[  122.247436]  [<c1075105>] ? trace_hardirqs_on+0xb/0xd
[  122.247439]  [<c158eb0c>] ? _raw_spin_unlock_irqrestore+0x66/0x78
[  122.247445]  [<c1570e10>] rtnl_lock+0x14/0x16
[  122.247449]  [<c156516b>] default_device_exit_batch+0x54/0xf3
[  122.247452]  [<c1570e1f>] ? rtnl_unlock+0xd/0xf
[  122.247454]  [<c1070233>] ? __wake_up_sync+0x12/0x12
[  122.247461]  [<c155e35d>] ops_exit_list+0x20/0x40
[  122.247464]  [<c155ec96>] cleanup_net+0xbe/0x140
[  122.247473]  [<c105ffe4>] process_one_work+0x29e/0x643
[  122.247479]  [<c1061215>] worker_thread+0x23a/0x311
[  122.247482]  [<c1060fdb>] ? rescuer_thread+0x204/0x204
[  122.247486]  [<c10648cc>] kthread+0xbe/0xc3
[  122.247490]  [<c158f4c0>] ret_from_kernel_thread+0x20/0x30
[  122.247492]  [<c106480e>] ? kthread_stop+0x364/0x364
[  122.247495] ---[ end trace 2073c37ae3c8b3b4 ]---
[  157.390879] Unregister pv shared memory for cpu 0
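
The key line in the trace above is the one reporting "state=2 set at ... prepare_to_wait". A small sketch for pulling those two fields out of a saved dmesg line follows. The helper decode_task_state and the variable names are hypothetical, and the helper only maps the three basic state values (0, 1, 2) used by kernels of this era; anything else is reported as unknown.

```shell
#!/bin/bash
# Map a task state value from the warning to its kernel name.
# Only the three basic states are covered; this is not a full decoder.
decode_task_state() {
	case "$1" in
		0) echo TASK_RUNNING ;;
		1) echo TASK_INTERRUPTIBLE ;;
		2) echo TASK_UNINTERRUPTIBLE ;;
		*) echo "unknown($1)" ;;
	esac
}

# The warning line, copied from the dmesg above.
warn_line='[  122.247334] do not call blocking ops when !TASK_RUNNING; state=2 set at [<c106ffd9>] prepare_to_wait+0x3c/0x5f'

# Extract the state value and the function that set it.
warn_state=$(printf '%s\n' "$warn_line" | sed -n 's/.*state=\([0-9a-f]*\) set at.*/\1/p')
warn_setter=$(printf '%s\n' "$warn_line" | sed -n 's/.*set at \[<[0-9a-f]*>] \([A-Za-z0-9_]*\)+.*/\1/p')

echo "$(decode_task_state "$warn_state") set by $warn_setter"
```

Read together with the call trace, this says the task was put into TASK_UNINTERRUPTIBLE by prepare_to_wait and then reached mutex_lock_nested via rtnl_lock before it had gone back to TASK_RUNNING.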

git bisect start 2d55520314eb5603b855ac1b994705dc6a352d9e 522e980064c24d3dd9859e9375e17417496567cf --
git bisect good c3f9b6ec744e12ff09677c4c0cb3164ad5b62702  # 19:25    300+     36  Merge branch 'sched/core'
git bisect good 344c57c17c7f857f9c92317e0d5cbb5c59f8d6e0  # 19:49    300+     62  Merge branch 'perf/urgent'
git bisect good 54de76b06a8098c11f15857a57e23c6e630a34b6  # 20:19    300+     66  Merge branch 'perf/core'
git bisect good 126b6dbcbedb5c0defe5c39e0310feed061569bf  # 20:51    300+     50  exit: Deal with nested sleeps
git bisect good 8641f9cba8ce5f3bfc5da47861180617cbfc6e7f  # 22:02    300+     68  module: Fix nested sleep
git bisect  bad 245747099820df3007f60128b1264fef9d2a69d2  # 22:25    142-     18  sched: Debug nested sleeps
git bisect good 592ed717ef33150f6888c333c28021283cc9aabc  # 22:59    300+     27  net: Clean up sk_wait_event() vs might_sleep()
# first bad commit: [245747099820df3007f60128b1264fef9d2a69d2] sched: Debug nested sleeps
git bisect good 592ed717ef33150f6888c333c28021283cc9aabc  # 00:15    900+    120  net: Clean up sk_wait_event() vs might_sleep()
git bisect  bad 2d55520314eb5603b855ac1b994705dc6a352d9e  # 00:19      0-     21  Merge branch 'sched/wait'
git bisect good cac7f2429872d3733dc3f9915857b1691da2eb2f  # 01:33    900+     66  Linux 3.18-rc2
git bisect good 7a891e6323e963f3301e44bdeee734028e34d390  # 02:26    900+     93  Add linux-next specific files for 20141027
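
If the log above is saved to a file, the verdict can be extracted mechanically. The sketch below is only an illustration: first_bad_commit and count_verdict are made-up helper names, the patterns assume the exact line layout shown in this log, and the sample is a four-line excerpt of it.

```shell
#!/bin/bash
# Print the commit id from a "# first bad commit: [<sha>] ..." line on stdin.
first_bad_commit() {
	sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)].*/\1/p'
}

# Count "git bisect good" or "git bisect  bad" lines on stdin.
count_verdict() {  # usage: count_verdict good|bad
	grep -c "^git bisect  *$1 "
}

# Excerpt of the bisect log above.
bisect_log='git bisect good 8641f9cba8ce5f3bfc5da47861180617cbfc6e7f  # 22:02    300+     68  module: Fix nested sleep
git bisect  bad 245747099820df3007f60128b1264fef9d2a69d2  # 22:25    142-     18  sched: Debug nested sleeps
git bisect good 592ed717ef33150f6888c333c28021283cc9aabc  # 22:59    300+     27  net: Clean up sk_wait_event() vs might_sleep()
# first bad commit: [245747099820df3007f60128b1264fef9d2a69d2] sched: Debug nested sleeps'

printf '%s\n' "$bisect_log" | first_bad_commit
echo "good: $(printf '%s\n' "$bisect_log" | count_verdict good)"
echo "bad:  $(printf '%s\n' "$bisect_log" | count_verdict bad)"
```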


This script may reproduce the error.

----------------------------------------------------------------------------
#!/bin/bash

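# Usage: pass the path to the kernel image to boot as the first argument.
kernel=$1
initrd=yocto-minimal-i386.cgz

# Fetch the initrd once; --no-clobber skips the download if the file already exists.
wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"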

kvm=(
	qemu-system-x86_64
	-cpu kvm64
	-enable-kvm
	-kernel "$kernel"
	-initrd "$initrd"
	-m 320
	-smp 1
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null 
)

append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------

Thanks,
Fengguang

View attachment "dmesg-yocto-vp-26:20141027222812:i386-randconfig-r2-1027:3.18.0-rc2-00037-g24574709:136" of type "text/plain" (49334 bytes)

View attachment "dmesg-quantal-kbuild-20:20141027231410:i386-randconfig-r2-1027:3.18.0-rc2-00036-g592ed71:139" of type "text/plain" (46065 bytes)

View attachment "config-3.18.0-rc2-00037-g24574709" of type "text/plain" (92205 bytes)

_______________________________________________
LKP mailing list
LKP@...ux.intel.com
