linux-ext4 - Re: [PATCH, RFC] fs: only call sync_filesystem() when remounting read-only

Message-ID: <20140310114508.GA28107@xanadu.blop.info>
Date:	Mon, 10 Mar 2014 12:45:08 +0100
From:	Lucas Nussbaum <lucas.nussbaum@...ia.fr>
To:	Theodore Ts'o <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	Emmanuel Jeanvoine <emmanuel.jeanvoine@...ia.fr>
Subject: Re: [PATCH, RFC] fs: only call sync_filesystem() when remounting
 read-only

On 08/03/14 at 11:08 -0500, Theodore Ts'o wrote:
> On Wed, Mar 05, 2014 at 03:13:43PM +0100, Lucas Nussbaum wrote:
> > TL;DR: we experience long temporary hangs when doing multiple mount -o
> > remount at the same time as other I/O on an ext4 filesystem.
> > 
> > When starting hundreds of LXC containers simultaneously on a system, the
> > boot of some containers was hanging. We tracked this down to an
> > initscript's use of mount -o remount, which was hanging in D state.
> > 
> > We reproduced the problem outside of LXC, with the script available at
> > [0]. That script initiates 1000 mount -o remount, and performs some
> > writes using a big cp to the same filesystem during the remounts....
> 
> +linux-fsdevel since the patch modifies fs/super.c
> 
> Lucas, can you try this patch?  I'm pretty sure this is what's going
> on.  It turns out each "mount -o remount" is implying an fsync(), so
> your test case is identical to copying a large file while having
> thousands of processes calling syncfs() on the file system, with the
> predictable results.

Hi Ted,

I can confirm that:
1) the patch solves my problem
2) issuing 'sync' instead of 'mount -o remount' indeed exhibits the
   problem again
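
For reference, the 'sync' variant of the test can be sketched roughly as
below. This is not the script at [0], only a minimal illustration under
stated assumptions: the function names, the file name "bigfile", the
number of sync calls and the write size are all hypothetical, and
os.sync() flushes every mounted filesystem rather than a single one as
syncfs() would.

```python
import os
import tempfile
import threading
import time


def write_file(path, total_bytes, chunk_bytes=1 << 20):
    """Write total_bytes of zeros to path (a stand-in for the big cp)."""
    chunk = b"\0" * chunk_bytes
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
    return written


def timed_sync(latencies, index):
    """Call sync() once and record how long it took, in seconds."""
    start = time.monotonic()
    os.sync()  # assumption: plain sync(), not syncfs() on one filesystem
    latencies[index] = time.monotonic() - start


def run_workload(n_syncs, total_bytes, directory=None):
    """Start one writer and n_syncs concurrent sync() callers.

    Returns the list of per-call sync latencies. If directory is None,
    a temporary directory is used (hypothetical default).
    """
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        target = os.path.join(workdir, "bigfile")
        latencies = [None] * n_syncs
        writer = threading.Thread(target=write_file, args=(target, total_bytes))
        syncers = [
            threading.Thread(target=timed_sync, args=(latencies, i))
            for i in range(n_syncs)
        ]
        writer.start()
        for t in syncers:
            t.start()
        for t in syncers:
            t.join()
        writer.join()
        return latencies
```

Something like max(run_workload(1000, 4 << 30, directory="/mnt/test"))
would then give the slowest sync; the mount point and sizes here are
placeholders, not the values used in the original test.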

However, I'm curious: why would such a workload (multiple syncfs()
initiated during a write) block for several minutes on an ext4
filesystem? I've just tried again on ext3, and it's not a problem in
that case.

Lucas