Message-ID: <519A1926.4050408@redhat.com>
Date:	Mon, 20 May 2013 07:37:58 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	Andreas Dilger <adilger@...ger.ca>
CC:	"Theodore Ts'o" <tytso@....edu>,
	"frankcmoeller@...or.de" <frankcmoeller@...or.de>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: Ext4: Slow performance on first write after mount

On 5/20/13 1:39 AM, Andreas Dilger wrote:
> On 2013-05-19, at 8:00, Theodore Ts'o <tytso@....edu> wrote:
>> On Fri, May 17, 2013 at 06:51:23PM +0200, frankcmoeller@...or.de wrote:
>>> - Why do you throw away the buffer cache instead of storing it on disk during umount? The initialization of the buffer cache is quite awful for applications which need a specific write throughput.
>>> - A workaround would be to read the whole /proc/.../mb_groups file right after every mount. Correct?
>>
>> Simply adding "cat /proc/fs/<dev>/mb_groups > /dev/null" to one of the
>> /etc/init.d scripts, or to /etc/rc.local is probably the simplest fix,
>> yes.
>>
>>> - I can try to add a mount option to initialize the cache at mount time. Would you be interested in such a patch?
>>
>> Given the simple nature of the above workaround, it's not obvious to
>> me that trying to make file system format changes, or even adding a
>> new mount option, is really worth it.  This is especially true given
>> that mount -a is sequential, so if there are a large number of big file
>> systems, using this as a mount option would slow down the boot
>> significantly.  It would be better to do this in parallel, which you
>> could do in userspace much more easily using the "cat
>> /proc/fs/<dev>/mb_groups" workaround.
> 
> Since we already have a thread starting at mount time to check the
> inode table zeroing, it would also be possible to co-opt this thread
> for preloading the group metadata from the bitmaps. 

Only up to a point, I hope; if the fs is so big that you start dropping the
first ones that were read, it'd be pointless.  So it'd need some nuance,
at the very least.

How much memory are you willing to dedicate to this, and how much does
it really help long-term, given that it's not pinned in any way?
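
(For what it's worth, a crude way to put a number on that for a given
filesystem is to watch Cached in /proc/meminfo around the read -- an
untested sketch, substitute your own device name for sda1:

    grep '^Cached' /proc/meminfo
    cat /proc/fs/ext4/sda1/mb_groups > /dev/null
    grep '^Cached' /proc/meminfo

Since it's ordinary page cache, whatever delta you see there is also
exactly the part that can be reclaimed again under memory pressure,
which is the pinning question above.)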

As long as we don't have efficiently-searchable on-disk freespace info,
it seems like anything else is just a workaround, I'm afraid.
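
For completeness, a rough, untested sketch of the parallel userspace
warm-up Ted described is below; it assumes util-linux findmnt is
available and that the /proc/fs/ext4/<name>/ directory matches the
basename of the mounted device node, which won't hold for every setup
(device-mapper names, for instance):

    #!/bin/sh
    # Warm ext4's mballoc buddy cache for every mounted ext4 filesystem,
    # one background read per filesystem, so big filesystems overlap.
    for dev in $(findmnt -rn -t ext4 -o SOURCE); do
        name=$(basename "$dev")
        f="/proc/fs/ext4/$name/mb_groups"
        [ -r "$f" ] && cat "$f" > /dev/null &
    done
    wait

Dropped into rc.local or a simple init script, that keeps the boot-time
cost at roughly the largest filesystem rather than the sum of all of them.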

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
