Message-ID: <20120603223617.GB7707@redhat.com>
Date: Sun, 3 Jun 2012 18:36:17 -0400
From: Dave Jones <davej@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Al Viro <viro@...IV.linux.org.uk>,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: processes hung after sys_renameat, and 'missing' processes
I noticed I had a ton of core dumps (like 70G worth) in a directory
I hadn't cleaned up in a while, and set about deleting them.
After a while I noticed the rm wasn't making any progress.
Even more strange, the rm process doesn't show up in the process list.
The shell that spawned it is still there, with no child processes,
but it hasn't returned to accept new input. (There are no OOM-kill messages
or anything, just totally missing pids.)
I did sysrq-t to see if it showed up there. It didn't, but I noticed
a ton of processes from my syscall fuzzer were still around, and all
of them were stuck in this trace:
trinity-child2 D 0000000000000000 5528 13066 1 0x00000004
ffff880100a37ce8 0000000000000046 0000000000000006 ffff880129070000
ffff880129070000 ffff880100a37fd8 ffff880100a37fd8 ffff880100a37fd8
ffff880145ec4d60 ffff880129070000 ffff880100a37cd8 ffff88014784e2a0
Call Trace:
[<ffffffff8164b919>] schedule+0x29/0x70
[<ffffffff8164bca8>] schedule_preempt_disabled+0x18/0x30
[<ffffffff8164a186>] mutex_lock_nested+0x196/0x3b0
[<ffffffff811b6d6e>] ? lock_rename+0x3e/0xf0
[<ffffffff811b6d6e>] ? lock_rename+0x3e/0xf0
[<ffffffff811b6d6e>] lock_rename+0x3e/0xf0
[<ffffffff811bcaca>] sys_renameat+0x11a/0x230
[<ffffffff8164d738>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[<ffffffff81050e1c>] ? do_setitimer+0x1cc/0x310
[<ffffffff810b1d7e>] ? put_lock_stats.isra.23+0xe/0x40
[<ffffffff8164d6d0>] ? _raw_spin_unlock_irq+0x30/0x60
[<ffffffff81086f81>] ? get_parent_ip+0x11/0x50
[<ffffffff81655177>] ? sysret_check+0x1b/0x56
[<ffffffff810b7cd5>] ? trace_hardirqs_on_caller+0x115/0x1a0
[<ffffffff813264be>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff811bcbfb>] sys_rename+0x1b/0x20
[<ffffffff81655152>] system_call_fastpath+0x16/0x1b
The whole sysrq-t is attached.
I ran mc to try and kill off all those core files, as I was running low on disk space,
and it deleted them without problem.
The two bash processes are chewing up 100% CPU, though strace shows no output.
It's still up and in this state if you want me to gather any further info
before I reboot it.
Dave
View attachment "dmesg.out" of type "text/plain" (108688 bytes)