linux-kernel - Re: 2.6.36 io bring the system to its knees

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20101028170132.GY27796@think>
Date:	Thu, 28 Oct 2010 13:01:32 -0400
From:	Chris Mason <chris.mason@...cle.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Pekka Enberg <penberg@...nel.org>,
	Aidar Kultayev <the.aidar@...il.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jens Axboe <axboe@...nel.dk>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Arjan van de Ven <arjan@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: 2.6.36 io bring the system to its knees

On Thu, Oct 28, 2010 at 03:30:36PM +0200, Ingo Molnar wrote:
> 
> "Many seconds freezes" and slowdowns wont be fixed via the VFS scalability patches 
> i'm afraid.
> 
> This has the appearance of some really bad IO or VM latency problem. Unfixed and 
> present in stable kernel versions going from years ago all the way to v2.6.36.

Hmmm, the workload you're describing here has two special parts.  First
it dramatically overloads the disk, and then it has guis doing things
waiting for the disk.

The virtualbox part of the workload is probably filling the queue with
huge amounts of synchronous random IO (I'm assuming it is going in via
O_DIRECT), and this will defeat any attempts from the filesystem to tell
the elevator "hey look, my IO is synchronous, please do hurry"

So, I'd try mounting ext4 in data=writeback mode.  I can't make ext4
stall fsyncs on non-fsync IO locally and it looks like they have solved
the ext3 data=ordered problem.  But I still like to rule out old and
known issues before we dig into new things.

I'd also suggest something like the below patch which is entirely
untested and must be blessed by an actual ext4 developer.  I think we
can make fsync faster if we put the mutex locking down in the FS, but
until then it should be ok to drop the mutex while we are doing the
expensive log commits:

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index 592adf2..1b7a637 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -114,6 +114,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	if (ext4_should_journal_data(inode))
 		return ext4_force_commit(inode->i_sb);
 
+	mutex_unlock(&inode->i_mutex);
 	commit_tid = datasync ? ei->i_datasync_tid : ei->i_sync_tid;
 	if (jbd2_log_start_commit(journal, commit_tid)) {
 		/*
@@ -133,5 +134,7 @@ int ext4_sync_file(struct file *file, int datasync)
 	} else if (journal->j_flags & JBD2_BARRIER)
 		blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL,
 			BLKDEV_IFL_WAIT);
+
+	mutex_lock(&inode->i_mutex);
 	return ret;
 }


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/