lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140708021149.GA12956@localhost>
Date:	Tue, 8 Jul 2014 10:11:49 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Jet Chen <jet.chen@...el.com>, Su Tao <tao.su@...el.com>,
	Yuanhan Liu <yuanhan.liu@...el.com>, LKP <lkp@...org>,
	linux-kernel@...r.kernel.org
Subject: [workqueue] WARNING: CPU: 0 PID: 76 at drivers/block/floppy.c:968
 schedule_bh()

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
commit b22ce2785d97423846206cceec4efee0c4afd980
Author:     Tejun Heo <tj@...nel.org>
AuthorDate: Wed Aug 28 17:33:37 2013 -0400
Commit:     Tejun Heo <tj@...nel.org>
CommitDate: Thu Aug 29 09:19:28 2013 -0400

    workqueue: cond_resched() after processing each work item
    
    If !PREEMPT, a kworker running work items back to back can hog CPU.
    This becomes dangerous when a self-requeueing work item which is
    waiting for something to happen races against stop_machine.  Such
    self-requeueing work item would requeue itself indefinitely hogging
    the kworker and CPU it's running on while stop_machine would wait for
    that CPU to enter stop_machine while preventing anything else from
    happening on all other CPUs.  The two would deadlock.
    
    Jamie Liu reports that this deadlock scenario exists around
    scsi_requeue_run_queue() and libata port multiplier support, where one
    port may exclude command processing from other ports.  With the right
    timing, scsi_requeue_run_queue() can end up requeueing itself trying
    to execute an IO which is asked to be retried while another device has
    an exclusive access, which in turn can't make forward progress due to
    stop_machine.
    
    Fix it by invoking cond_resched() after executing each work item.
    
    Signed-off-by: Tejun Heo <tj@...nel.org>
    Reported-by: Jamie Liu <jamieliu@...gle.com>
    References: http://thread.gmane.org/gmane.linux.kernel/1552567
    Cc: stable@...r.kernel.org
    --
     kernel/workqueue.c |    9 +++++++++
     1 file changed, 9 insertions(+)

+---------------------------------------------------------+------------+------------+------------------+
|                                                         | c95389b4cd | b22ce2785d | v3.16-rc4_070708 |
+---------------------------------------------------------+------------+------------+------------------+
| boot_successes                                          | 900        | 249        | 79               |
| boot_failures                                           | 0          | 51         | 2                |
| WARNING:CPU:PID:at_drivers/block/floppy.c:schedule_bh() | 0          | 51         | 2                |
| backtrace:user_reset_fdc                                | 0          | 51         | 2                |
| backtrace:floppy_async_init                             | 0          | 51         | 2                |
| backtrace:async_run_entry_fn                            | 0          | 51         | 2                |
+---------------------------------------------------------+------------+------------+------------------+

[   45.121121] PCI: setting IRQ 11 as level-triggered
[   45.407538] e1000 0000:00:03.0: setting latency timer to 64
[   45.686040] ------------[ cut here ]------------
[   45.975060] WARNING: CPU: 0 PID: 76 at drivers/block/floppy.c:968 schedule_bh+0x47/0x80()
[   47.312537] CPU: 0 PID: 76 Comm: kworker/u2:1 Not tainted 3.11.0-rc7-00025-gb22ce27 #8
[   47.848171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   48.433222] Workqueue: events_unbound async_run_entry_fn
[   48.725491]  00000000 d25d7e0c c2745b10 d25d7e3c c105472d c2ff0b10 00000000 0000004c
[   49.494639]  c30df57c 000003c8 c1a2fb57 c1a2fb57 00000001 c1a33aa0 c39a0ca0 d25d7e4c
[   51.548252]  c1054812 00000009 00000000 d25d7e5c c1a2fb57 00000000 00000000 d25d7e84
[   56.319716] Call Trace:
[   58.164457]  [<c2745b10>] dump_stack+0x16/0x18
[   61.355641]  [<c105472d>] warn_slowpath_common+0x8d/0xb0
[   61.822789]  [<c1a2fb57>] ? schedule_bh+0x47/0x80
[   62.103589]  [<c1a2fb57>] ? schedule_bh+0x47/0x80
[   62.538170]  [<c1a33aa0>] ? floppy_hardint+0x1a0/0x1a0
[   62.920432]  [<c1054812>] warn_slowpath_null+0x22/0x30
[   63.411821]  [<c1a2fb57>] schedule_bh+0x47/0x80
[   66.271423]  [<c1a33ed5>] wait_til_done+0x15/0x230
[   71.033237]  [<c1063260>] ? cascade+0xa0/0xa0
[   75.990887]  [<c1a348d7>] user_reset_fdc+0xa7/0x110
[   78.921153]  [<c384c939>] floppy_async_init+0x9ee/0x1263
[   83.512228]  [<c108a0e4>] async_run_entry_fn+0x44/0x1e0
[   85.519262]  [<c16901b2>] ? __list_del_entry+0x202/0x220
[   87.120223]  [<c1075687>] ? pwq_dec_nr_in_flight+0x137/0x140
[   89.499077]  [<c1075f5b>] process_one_work+0x33b/0x840
[   90.274017]  [<c108f542>] ? ttwu_do_wakeup+0x12/0x90
[   91.294072]  [<c16901b2>] ? __list_del_entry+0x202/0x220
[   91.634105]  [<c1076874>] worker_thread+0x414/0x890
[   91.994817]  [<c1076460>] ? process_one_work+0x840/0x840
[   92.265965]  [<c1080ba6>] kthread+0xa6/0xb0
[   92.658144]  [<c2763f57>] ret_from_kernel_thread+0x1b/0x28
[   92.971880]  [<c1080b00>] ? __kthread_unpark+0x70/0x70
[   93.254541] ---[ end trace 857f8343e4eb716b ]---
[   93.508601] floppy0: no floppy controllers found

git bisect start v3.11 v3.10 --
git bisect good 89d0abe3d695103505c025dde6e07b9c3dd772f4  # 13:01    300+      0  Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
git bisect good 8742f229b635bf1c1c84a3dfe5e47c814c20b5c8  # 13:18    300+      0  userns: limit the maximum depth of user_namespace->parent chain
git bisect good 94fc5d9de5bd757ad46f0d94bc4ebf617c4487f6  # 13:35    300+      0  proc: return on proc_readdir error
git bisect good 5e25e4f304ce53004dcd01725938c0bc45c75d35  # 13:48    300+      0  Merge tag 'usb-3.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 06a557f7a68e126181f09831cb2fac6f7a7f43c2  # 14:22    300+      1  Merge tag 'nfs-for-3.11-5' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
git bisect  bad 936dbcc3f2fc7314f3c1645fc39ea51117f6d73b  # 14:36     15-      2  Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect  bad 155e3a353943a3e713cd33357cfd58e995900a81  # 14:50     13-      1  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect  bad ff497452636f4687e517964817b7e2bd99f4b44b  # 15:06     48-      2  Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
git bisect  bad b22ce2785d97423846206cceec4efee0c4afd980  # 15:51     13-      2  workqueue: cond_resched() after processing each work item
# first bad commit: [b22ce2785d97423846206cceec4efee0c4afd980] workqueue: cond_resched() after processing each work item
git bisect good c95389b4cd6a4b52af78bea706a274453e886251  # 16:51    900+      0  Merge branch 'akpm' (patches from Andrew Morton)
git bisect  bad 98610962bbb418f6d847c402637ca14e78c38344  # 16:51      0-      2  0day head guard for 'devel-hourly-2014070708'
git bisect  bad cd3de83f147601356395b57a8673e9c5ff1e59d1  # 16:51      0-     34  Linux 3.16-rc4
git bisect  bad d584f2acf5cc6c4a14c330930222d8e70fe024a6  # 18:02     62-      1  Add linux-next specific files for 20140707


This script may reproduce the error.

-----------------------------------------------------------------------------
#!/bin/bash

kernel=$1
initrd=yocto-minimal-i386.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/blob/master/initrd/$initrd

kvm=(
	qemu-system-x86_64 -cpu kvm64 -enable-kvm 
	-kernel $kernel
	-initrd $initrd
	-smp 2
	-m 256M
	-net nic,vlan=0,macaddr=00:00:00:00:00:00,model=virtio
	-net user,vlan=0
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-serial stdio
	-display none
	-monitor null
)

append=(
	debug
	sched_debug
	apic=debug
	ignore_loglevel
	sysrq_always_enabled
	panic=10
	prompt_ramdisk=0
	earlyprintk=ttyS0,115200
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
)

"${kvm[@]}" --append "${append[*]}"
-----------------------------------------------------------------------------

Thanks,
Fengguang

View attachment "dmesg-yocto-ivb41-99:20140707155130:i386-randconfig-c1-07070935::" of type "text/plain" (48547 bytes)

Download attachment "i386-randconfig-c1-07070935-98610962bbb418f6d847c402637ca14e78c38344-WARNING:---at----schedule_bh+-x-71922.log" of type "application/octet-stream" (62212 bytes)

View attachment "config-3.11.0-rc7-00025-gb22ce27" of type "text/plain" (84480 bytes)

_______________________________________________
LKP mailing list
LKP@...ux.intel.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ