Message-ID: <4970F2B6.1060508@cosmosbay.com>
Date: Fri, 16 Jan 2009 21:48:54 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Vegard Nossum <vegard.nossum@...il.com>
CC: lkml <linux-kernel@...r.kernel.org>, Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: 2.6.27.9: splice_to_pipe() hung (blocked for more than 120 seconds)

CCed to netdev

Vegard Nossum wrote:
> Hi,
>
> Seeing some recent splice() discussions, I decided to explore this
> system call. I have written a program which might look, well, not very
> useful, but the fact is that it creates an unkillable zombie process.
> Another funny side effect is that system load continually rises, even
> though the system seems to stay fully interactive and functional.
>
> After a while, I also get some messages like this:
>
> Jan 15 20:11:37 localhost kernel: INFO: task a.out:7149 blocked for more than 120 seconds.
> Jan 15 20:11:37 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jan 15 20:11:37 localhost kernel: a.out D ec6e2610 0 7149 1
> Jan 15 20:11:37 localhost kernel: ec5aad44 00000082 c042451f ec6e2610 00989680 c07da67c c07ddb80 c07ddb80
> Jan 15 20:11:37 localhost kernel: c07ddb80 ec6e4c20 ec6e4e7c c201db80 00000001 c201db80 470fed45 0000036b
> Jan 15 20:11:37 localhost kernel: ec5aad38 c0421027 ec6e263c ec6e4e7c ec6e3fa8 85c129f4 ec6e4c20 ec6e4c20
> Jan 15 20:11:37 localhost kernel: Call Trace:
> Jan 15 20:11:37 localhost kernel: [<c064420f>] __mutex_lock_common+0x8a/0xd9
> Jan 15 20:11:37 localhost kernel: [<c0644302>] __mutex_lock_slowpath+0x12/0x15
> Jan 15 20:11:37 localhost kernel: [<c0644181>] mutex_lock+0x29/0x2d
> Jan 15 20:11:37 localhost kernel: [<c04aa8f1>] splice_to_pipe+0x23/0x1f5
> Jan 15 20:11:37 localhost kernel: [<c04ab290>] __generic_file_splice_read+0x3ff/0x413
> Jan 15 20:11:37 localhost kernel: [<c04ab324>] generic_file_splice_read+0x80/0x9a
> Jan 15 20:11:37 localhost kernel: [<c04a9e95>] do_splice_to+0x4e/0x5f
> Jan 15 20:11:37 localhost kernel: [<c04aa010>] sys_splice+0x16a/0x1c8
> Jan 15 20:11:37 localhost kernel: [<c0403cca>] syscall_call+0x7/0xb
> Jan 15 20:11:37 localhost kernel: =======================
>
> (but this was from such a system with 6 zombies and ~80 load. See
> attachments for SysRq report with processes in blocked state, it has
> similar info but for just one zombie.)
>
> This happens with 2.6.27.9-73.fc9.i686 kernel. Maybe it was fixed
> recently? (In any case, I don't think it is a regression.)
>
> It seems to be not 100% reproducible. Sometimes it works, sometimes
> not. Start the program, then after a while hit Ctrl-C. If it doesn't
> exit, zombie count will rise and system state will be as described.
> Compile with -lpthread.
>

I tried your program on the latest git tree and could not reproduce any
problem. (I changed it to 9 threads since I have 8 CPUs.)

The problem might be that your threads all fight over the same pipe, with a
mutex protecting its inode. So mutex_lock() could possibly starve for more
than 120 seconds?

Maybe you can reproduce the problem using standard read()/write() syscalls...
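
For illustration only (this is not part of the original message, and Vegard's attached program is not reproduced here), a minimal sketch of the read()/write() variant suggested above might look like the following. Several threads contend on a single pipe using plain syscalls instead of splice(). The function name pipe_contention_run(), the 4096-byte chunk size, and the 64-thread cap are assumptions made for this sketch.

```c
#include <pthread.h>
#include <unistd.h>

#define CHUNK 4096  /* assumed transfer size, not taken from the thread */

struct worker_arg {
    int fd;           /* pipe end used by this thread */
    long iterations;  /* writers only: number of CHUNK-sized writes */
    long bytes;       /* bytes this thread actually transferred */
};

/* Writer: push CHUNK bytes into the shared pipe, `iterations` times. */
static void *writer(void *p)
{
    struct worker_arg *a = p;
    char buf[CHUNK] = {0};

    for (long i = 0; i < a->iterations; i++) {
        ssize_t n = write(a->fd, buf, sizeof buf);
        if (n <= 0)
            break;
        a->bytes += n;
    }
    return NULL;
}

/* Reader: drain the shared pipe until EOF (all write ends closed). */
static void *reader(void *p)
{
    struct worker_arg *a = p;
    char buf[CHUNK];
    ssize_t n;

    while ((n = read(a->fd, buf, sizeof buf)) > 0)
        a->bytes += n;
    return NULL;
}

/*
 * Hypothetical helper: run nwriters + nreaders threads against one pipe.
 * Returns the total number of bytes read, or -1 on error.
 */
long pipe_contention_run(int nwriters, int nreaders, long iterations)
{
    enum { MAX_THREADS = 64 };
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg[MAX_THREADS];
    int fds[2];
    int n = nwriters + nreaders;
    int made = 0;
    long total = 0;

    if (nwriters < 1 || nreaders < 1 || n > MAX_THREADS || pipe(fds) != 0)
        return -1;

    /* Readers occupy slots [0, nreaders), writers the remaining slots. */
    for (; made < n; made++) {
        int is_reader = made < nreaders;

        arg[made] = (struct worker_arg){ is_reader ? fds[0] : fds[1],
                                         iterations, 0 };
        if (pthread_create(&tid[made], NULL,
                           is_reader ? reader : writer, &arg[made]) != 0)
            break;
    }

    /* Join writers first, then close the write end so readers see EOF. */
    for (int i = nreaders; i < made; i++)
        pthread_join(tid[i], NULL);
    close(fds[1]);

    for (int i = 0; i < made && i < nreaders; i++) {
        pthread_join(tid[i], NULL);
        total += arg[i].bytes;
    }
    close(fds[0]);

    return made == n ? total : -1;
}
```

Calling, for example, pipe_contention_run(4, 5, 100000) from a main() gives 9 threads on one pipe, matching the thread count mentioned above; build with -lpthread. Whether this triggers the same hung-task report is not known; it only exercises the same pipe contention through read()/write().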