linux-kernel - Re: Subject: Warning in workqueue.c

Date:	Thu, 13 Feb 2014 12:58:10 -0500
From:	"Jason J. Herne" <jjherne@...ux.vnet.ibm.com>
To:	Lai Jiangshan <laijs@...fujitsu.com>
CC:	Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: Subject: Warning in workqueue.c

On 02/12/2014 10:31 PM, Lai Jiangshan wrote:
> On 02/12/2014 11:18 PM, Jason J. Herne wrote:

> Could you use the following patch for test if Tejun doesn't give you a new one.

Lai,

Here is the output with the patch you asked me to run.

[ 5779.795687] ------------[ cut here ]------------
[ 5779.795695] WARNING: at kernel/workqueue.c:2159
[ 5779.795698] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_CHECKSUM iptable_mangle bridge stp llc ip6table_filter ip6_tables ebtable_nat ebtables iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi tape_3590 qeth_l2 tape tape_class vhost_net tun vhost macvtap macvlan lcs dasd_eckd_mod dasd_mod qeth ccwgroup zfcp scsi_transport_fc scsi_tgt qdio dm_multipath [last unloaded: kvm]
[ 5779.795733] CPU: 4 PID: 270 Comm: kworker/5:1 Not tainted 3.14.0-rc1 #1
[ 5779.795738] task: 0000000001938000 ti: 00000000f4d9c000 task.ti: 00000000f4d9c000
[ 5779.795750] Krnl PSW : 0404c00180000000 000000000015b452 (process_one_work+0x666/0x688)
[ 5779.795756]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 000003210f9db000 0000000000bc2a52 0000000001b640c0 0000000000000001
[ 5779.795757]            0000000000000000 0000000000000004 0000000000000005 00000000ffffffff
[ 5779.795759]            0000000000000000 0000000084a43500 0000000084a3f000 0000000084a3f018
[ 5779.795763]            0000000001b640c0 0000000000735d18 00000000f4d9fdc8 00000000f4d9fd50
[ 5779.795781] Krnl Code: 000000000015b444: dd1a9640c05b	trt	1600(27,%r9),91(%r12)
            000000000015b44a: a7f4fd9e		brc	15,15af86
           #000000000015b44e: a7f40001		brc	15,15b450
           >000000000015b452: 92011000		mvi	0(%r1),1
            000000000015b456: a7f4fe63		brc	15,15b11c
            000000000015b45a: c03000533af9	larl	%r3,bc2a4c
            000000000015b460: 95003000		cli	0(%r3),0
            000000000015b464: a774ff3e		brc	7,15b2e0
[ 5779.795810] Call Trace:
[ 5779.795814] ([<000000000015b0ea>] process_one_work+0x2fe/0x688)
[ 5779.795817]  [<000000000015ba62>] worker_thread+0x1a6/0x3d4
[ 5779.795822]  [<00000000001648c2>] kthread+0x10e/0x128
[ 5779.795828]  [<0000000000728ed6>] kernel_thread_starter+0x6/0xc
[ 5779.795832]  [<0000000000728ed0>] kernel_thread_starter+0x0/0xc
[ 5779.795834] Last Breaking-Event-Address:
[ 5779.795837]  [<000000000015b44e>] process_one_work+0x662/0x688
[ 5779.795840] ---[ end trace 8b6353b0f2821ec9 ]---
[ 5779.795844] XXX: worker->flags=0x1 pool->flags=0x0 cpu=4 pool->cpu=5(1) rescue_wq=          (null)
[ 5779.795848] XXX: last_unbind=-44 last_rebind=0 last_rebound_clear=0 nr_exected_after_rebound_clear=0
[ 5779.795852] XXX: sleep=-39 wakeup=0
[ 5779.795855] XXX: cpus_allowed=5
[ 5779.795857] XXX: cpus_allowed_after_rebinding=5
[ 5779.795861] XXX: after schedule(), cpu=4

You had asked about reproducing this. This is on the s390 platform; I'm
not sure whether that makes any difference.

The workload is:
2 processes onlining random CPUs in a tight loop, using
'echo 1 > /sys/bus/cpu.../online'
2 processes offlining random CPUs in a tight loop, using
'echo 0 > /sys/bus/cpu.../online'
Otherwise, the system is fairly idle. Load average: 5.82, 6.27, 6.27

The machine has 10 processors.
The warning sometimes hits within a few minutes of starting the
workload. Other times it takes several hours.
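
The workload described above could be approximated with a script along
these lines. It is a sketch, not the script used for this report: the
sysfs directory (CPU_SYSFS_DIR, defaulting here to
/sys/devices/system/cpu) and all function names are assumptions, and it
relies on shuf from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical reproduction sketch; not the script used in the report.
# ASSUMPTION: CPU_SYSFS_DIR is a directory holding cpuN/online files.
# The report elides the exact sysfs path, so adjust this as needed.
CPU_SYSFS_DIR="${CPU_SYSFS_DIR:-/sys/devices/system/cpu}"

# Print the number of every CPU that exposes an "online" control file.
list_hotplug_cpus() {
    for f in "$CPU_SYSFS_DIR"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue
        d=${f%/online}
        echo "${d##*/cpu}"
    done
}

# Print one randomly chosen CPU number (needs shuf from GNU coreutils).
pick_random_cpu() {
    list_hotplug_cpus | shuf -n 1
}

# Write $1 (1 = online, 0 = offline) to a random CPU's online file.
# Write errors are silenced; the function returns non-zero on failure.
set_random_cpu() {
    cpu=$(pick_random_cpu)
    [ -n "$cpu" ] || return 1
    { echo "$1" > "$CPU_SYSFS_DIR/cpu$cpu/online"; } 2>/dev/null
}

# Toggle random CPUs forever in a tight loop.
hotplug_loop() {
    while :; do
        set_random_cpu "$1"
    done
}

# Two onlining and two offlining processes, as in the description above.
start_workload() {
    hotplug_loop 1 &
    hotplug_loop 1 &
    hotplug_loop 0 &
    hotplug_loop 0 &
    wait
}
```

Nothing runs until start_workload is called, which has to be done as
root on a machine where losing CPUs at random is acceptable. The loops
never exit on their own, so stop them by killing the script's process
group.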

Please let me know if you have further questions.

-- 
-- Jason J. Herne (jjherne@...ux.vnet.ibm.com)
