Message-Id: <20070920153654.b9e90616.akpm@linux-foundation.org>
Date: Thu, 20 Sep 2007 15:36:54 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Chuck Ebbert <cebbert@...hat.com>
Cc: Matthias Hensler <matthias@...se.de>,
linux-kernel <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
richard kennedy <richard@....demon.co.uk>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: Processes spinning forever, apparently in lock_timer_base()?
On Thu, 20 Sep 2007 18:04:38 -0400
Chuck Ebbert <cebbert@...hat.com> wrote:
> >
> >> Can we get some kind of band-aid, like making the endless 'for' loop in
> >> balance_dirty_pages() terminate after some number of iterations? Clearly
> >> if we haven't written "write_chunk" pages after a few tries, *and* we
> >> haven't encountered congestion, there's no point in trying forever...
> >
> > Did my above questions get looked at?
> >
> > Is anyone able to reproduce this?
> >
> > Do we have a clue what's happening?
>
> There are a ton of dirty pages for one disk, and zero or close to zero dirty
> for a different one. Kernel spins forever trying to write some arbitrary
> minimum amount of data ("write_chunk" pages) to the second disk...
That should be OK. The caller will sit in that loop, sleeping in
congestion_wait(), occasionally polling the correct backing-dev and waiting
until the dirty counts subside below the threshold, at which stage this:
	if (nr_reclaimable +
			global_page_state(NR_WRITEBACK)
				<= dirty_thresh)
		break;
will happen and we leave balance_dirty_pages().
That's all a bit crappy if the wrong races happen and some other task is
somehow exceeding the dirty limits each time this task polls them. Seems
unlikely that such a condition would persist forever.
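To illustrate, here is a rough userspace model of that loop's exits. It is
only a sketch, not the real mm/page-writeback.c code: the globals stand in
for global_page_state(), and writeout_this_bdi() stands in for the actual
writeback of this task's backing device, though the names (write_chunk,
congestion_wait, dirty_thresh) loosely follow the 2.6.23-era code.

/*
 * Toy model of the balance_dirty_pages() polling loop -- control flow
 * only, not kernel code.
 */
#include <stdio.h>

static long nr_reclaimable = 1200;	/* dirty + unstable pages (model) */
static long nr_writeback = 100;		/* pages under writeback (model) */
static const long dirty_thresh = 1000;
static const long write_chunk = 16;
static int polls;			/* number of congestion_wait() polls */

/* Pretend only a few of this bdi's pages can be written per pass. */
static long writeout_this_bdi(long want)
{
	long done = want / 4;
	nr_reclaimable -= done;
	return done;
}

static void congestion_wait(void)
{
	/* The kernel sleeps here until the device uncongests or a
	 * timeout expires; the model just counts the polls. */
	polls++;
}

int main(void)
{
	for (;;) {
		long written = writeout_this_bdi(write_chunk);

		/* The exit quoted above: global dirty + writeback fell
		 * below the threshold, so we leave the loop. */
		if (nr_reclaimable + nr_writeback <= dirty_thresh)
			break;

		/* Managed to write a full chunk: also done. */
		if (written >= write_chunk)
			break;

		congestion_wait();	/* otherwise poll again later */
	}
	printf("left after %d congestion_wait() polls, dirty=%ld\n",
	       polls, nr_reclaimable);
	return 0;
}

If the dirty pages all belong to a device whose writeback makes no progress,
nr_reclaimable never drops and neither exit fires, which is the "spinning
forever" symptom being reported.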
So the question is, why do we have large amounts of dirty pages for one
disk which appear to be sitting there not getting written?
Do we know if there's any writeout at all happening when the system is in
this state?
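For what it's worth, one quick way to check is to sample the Dirty and
Writeback lines in /proc/meminfo for a few seconds; if Writeback sits at
zero while Dirty stays high, nothing is being written out. A trivial
sketch (watching /proc/meminfo by hand or running vmstat 1 works just as
well):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	for (int i = 0; i < 5; i++) {
		FILE *f = fopen("/proc/meminfo", "r");
		char line[128];

		if (!f) {
			perror("/proc/meminfo");
			return 1;
		}
		/* Print just the dirty and under-writeback page counts. */
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "Dirty:", 6) ||
			    !strncmp(line, "Writeback:", 10))
				fputs(line, stdout);
		fclose(f);
		putchar('\n');
		sleep(1);
	}
	return 0;
}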
I guess it's possible that the dirty inodes on the "other" disk got
themselves onto the wrong per-sb inode list, or are on the correct list,
but in the wrong place. If so, these:
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-2.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-3.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-4.patch
writeback-fix-comment-use-helper-function.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-5.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-6.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-7.patch
writeback-fix-periodic-superblock-dirty-inode-flushing.patch
from 2.6.23-rc6-mm1 should help.
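To make that failure mode concrete, here is a toy model of the ordering
problem those patches address. It is not kernel code: the names
(dirtied_when, the oldest-first walk with a cutoff like older_than_this)
only loosely follow the 2.6.23-era fs/fs-writeback.c.

#include <stdio.h>

struct toy_inode {
	const char *name;
	long dirtied_when;	/* "jiffies" at which it was dirtied */
};

/* The dirty list is supposed to be oldest-first; writeback stops at the
 * first inode newer than the cutoff. */
static void writeback_pass(struct toy_inode *list, int n, long cutoff)
{
	for (int i = 0; i < n; i++) {
		if (list[i].dirtied_when > cutoff) {
			printf("stop at %s (too new), %d inode(s) left unwritten\n",
			       list[i].name, n - i);
			return;
		}
		printf("writing %s\n", list[i].name);
	}
}

int main(void)
{
	/* A mis-ordered list: "other-disk" was dirtied long ago but sits
	 * behind a much newer inode, so the oldest-first walk never
	 * reaches it and its pages stay dirty indefinitely. */
	struct toy_inode s_dirty[] = {
		{ "root-disk-a",  100 },
		{ "root-disk-b",  900 },	/* newer than the cutoff */
		{ "other-disk",    50 },	/* old, but queued too late */
	};

	writeback_pass(s_dirty, 3, 500);
	return 0;
}

That is the shape of problem the time-ordering fixes are meant to rule out;
whether it is actually what's happening here is exactly the open question.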
Did anyone try running /bin/sync when the system is in this state?