linux-kernel - Re: Processes spinning forever, apparently in lock_timer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1190462915.3156.5.camel@castor.rsk.org>
Date:	Sat, 22 Sep 2007 13:08:35 +0100
From:	richard kennedy <richard@....demon.co.uk>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Chuck Ebbert <cebbert@...hat.com>,
	Matthias Hensler <matthias@...se.de>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: Processes spinning forever, apparently in lock_timer_base()?

On Fri, 2007-09-21 at 03:33 -0700, Andrew Morton wrote:
> On Fri, 21 Sep 2007 11:25:41 +0100 richard kennedy <richard@....demon.co.uk> wrote:
> 
> > > That's all a bit crappy if the wrong races happen and some other task is
> > > somehow exceeding the dirty limits each time this task polls them.  Seems
> > > unlikely that such a condition would persist forever.
> > > 
> > > So the question is, why do we have large amounts of dirty pages for one
> > > disk which appear to be sitting there not getting written?
> > 
> > The lockup I'm seeing intermittently occurs when I have 2+ tasks copying
> > large files (1Gb+) on sda & a small read-mainly mysql db app running on
> > sdb. The lockup seems to happen just after the copies finish -- there
> > are lots of dirty pages but nothing left to write them until kupdate
> > gets round to it. 
> 
> Then what happens?  The system recovers?
when my system is locked up I get this from sysrq-w
(2.6.23-rc7)

SysRq : Show Blocked State
auditd        D ffff8100b422fd28     0  1999      1
 ffff8100b422fd78 0000000000000086 0000000000000000 ffff8100b4103020
 0000000000000286 ffff8100b4103020 ffff8100bf8af020 ffff8100b41032b8
 0000000100000000 0000000000000001 ffffffffffffffff 0000000000000003
Call Trace:
 [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
 [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
 [<ffffffff810a79f7>] __writeback_single_inode+0x1f4/0x332
 [<ffffffff8108c0b0>] vfs_statfs_native+0x29/0x34
 [<ffffffff810a8362>] sync_inode+0x24/0x36
 [<ffffffff8803c350>] :ext3:ext3_sync_file+0xb4/0xc8
 [<ffffffff81240730>] mutex_lock+0x1e/0x2f
 [<ffffffff810aa7fa>] do_fsync+0x52/0xa4
 [<ffffffff810aa86f>] __do_fsync+0x23/0x36
 [<ffffffff8100bc8e>] system_call+0x7e/0x83

syslogd       D ffff8100b3f67d28     0  2022      1
 ffff8100b3f67d78 0000000000000086 0000000000000000 0000000100000000
 0000000000000003 ffff810037c66810 ffff8100bf8af020 ffff810037c66aa8
 0000000100000000 0000000000000001 ffffffffffffffff 0000000000000003
Call Trace:
 [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
 [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
 [<ffffffff810a79f7>] __writeback_single_inode+0x1f4/0x332
 [<ffffffff810a8362>] sync_inode+0x24/0x36
 [<ffffffff8803c350>] :ext3:ext3_sync_file+0xb4/0xc8
 [<ffffffff81240730>] mutex_lock+0x1e/0x2f
 [<ffffffff810aa7fa>] do_fsync+0x52/0xa4
 [<ffffffff810aa86f>] __do_fsync+0x23/0x36
 [<ffffffff8100be0c>] tracesys+0xdc/0xe1


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/