Message-ID: <519A1926.4050408@redhat.com>
Date:	Mon, 20 May 2013 07:37:58 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	Andreas Dilger <adilger@...ger.ca>
CC:	"Theodore Ts'o" <tytso@....edu>,
	"frankcmoeller@...or.de" <frankcmoeller@...or.de>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: Ext4: Slow performance on first write after mount

On 5/20/13 1:39 AM, Andreas Dilger wrote:
> On 2013-05-19, at 8:00, Theodore Ts'o <tytso@....edu> wrote:
>> On Fri, May 17, 2013 at 06:51:23PM +0200, frankcmoeller@...or.de wrote:
>>> - Why do you throw away the buffer cache instead of storing it on disk during umount? The initialization of the buffer cache is quite awful for applications which need a specific write throughput.
>>> - A workaround would be to read the whole /proc/.../mb_groups file right after every mount. Correct?
>>
>> Simply adding "cat /proc/fs/<dev>/mb_groups > /dev/null" to one of the
>> /etc/init.d scripts, or to /etc/rc.local is probably the simplest fix,
>> yes.
>>
>>> - I can try to add a mount option to initialize the cache at mount time. Would you be interested in such a patch?
>>
>> Given the simple nature of the above workaround, it's not obvious to
>> me that trying to make file system format changes, or even adding a
>> new mount option, is really worth it.  This is especially true given
>> that mount -a is sequential, so if there are a large number of big file
>> systems, using this as a mount option would slow down the boot
>> significantly.  It would be better to do this in parallel, which you
>> could do in userspace much more easily using the "cat
>> /proc/fs/<dev>/mb_groups" workaround.
> 
> Since we already have a thread starting at mount time to check the
> inode table zeroing, it would also be possible to co-opt this thread
> for preloading the group metadata from the bitmaps. 

Only up to a point, I hope; if the fs is so big that you start dropping the
first ones that were read, it'd be pointless.  So it'd need some nuance,
at the very least.

How much memory are you willing to dedicate to this, and how much does
it really help long-term, given that it's not pinned in any way?
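
(For what it's worth, a crude way to put a number on that for a given
filesystem is to watch Cached in /proc/meminfo around the read -- an
untested sketch, substitute your own device name for sda1:

    grep '^Cached' /proc/meminfo
    cat /proc/fs/ext4/sda1/mb_groups > /dev/null
    grep '^Cached' /proc/meminfo

Since it's ordinary page cache, whatever delta you see there is also
exactly the part that can be reclaimed again under memory pressure,
which is the pinning question above.)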

As long as we don't have efficiently-searchable on-disk freespace info,
it seems like anything else is just a workaround, I'm afraid.
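
For completeness, a rough, untested sketch of the parallel userspace
warm-up Ted described is below; it assumes util-linux findmnt is
available and that the /proc/fs/ext4/<name>/ directory matches the
basename of the mounted device node, which won't hold for every setup
(device-mapper names, for instance):

    #!/bin/sh
    # Warm ext4's mballoc buddy cache for every mounted ext4 filesystem,
    # one background read per filesystem, so big filesystems overlap.
    for dev in $(findmnt -rn -t ext4 -o SOURCE); do
        name=$(basename "$dev")
        f="/proc/fs/ext4/$name/mb_groups"
        [ -r "$f" ] && cat "$f" > /dev/null &
    done
    wait

Dropped into rc.local or a simple init script, that keeps the boot-time
cost at roughly the largest filesystem rather than the sum of all of them.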

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
