Message-ID: <20121204192407.GB7790@thunk.org>
Date: Tue, 4 Dec 2012 14:24:07 -0500
From: Theodore Ts'o <tytso@....edu>
To: Darren Hart <dvhart@...radead.org>
Cc: Andreas Dilger <adilger@...ger.ca>,
linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: [e2fsprogs] initdir: Writing inode after the initial write?
On Tue, Dec 04, 2012 at 09:46:06AM -0800, Darren Hart wrote:
>
> I think what I'm reading here is that if you care about having a
> filesystem that makes hardware specific optimizations, you're better off
> mounting the device and copying the filesystem over. In that case, plan
> on needing root access.
Well, ext4 currently doesn't optimize for erase block alignment
either.
If I had the free time, and it was something that I could work on on
$DAYJOB time, here are some projects that I've been thinking about:
1) Add support for erase block alignment using the same mechanism
we've been planning for RAID 5 stripe alignment.
2) Add either a superblock flag or a mount option which enables an
eMMC block allocation algorithm that would support more aggressive
optimizations.
3) Allow a zero-length file to have its extent flag turned off (so it
would be using the old indirect block scheme).
4) If a file has the extent flag turned off, and the eMMC block
allocation algorithm is enabled, and the workload appears to be doing
random overwrites, implement data block copy-on-write. (That is,
allocate a new block and then update the indirect block to point to
the new block.)
5) If the eMMC block allocation algorithm is enabled, teach the block
allocator to aggressively allocate contiguous physical blocks
(initially aligned on an erase block) regardless of what the logical
block number is, since with flash seeks are essentially free, and with
indirect blocks we don't care about extent fragmentation.
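
To make items 4 and 5 concrete, here is a rough userspace sketch, not kernel code and not anything that exists in ext4 today. Every name in it (ERASE_BLOCK_BLOCKS, align_up, SketchAllocator, IndirectFile) is made up for illustration, and the 4 MiB erase block with 4 KiB file system blocks is only an assumed geometry. It shows an allocator that starts on an erase block boundary and hands out physical blocks contiguously regardless of logical block number, plus an overwrite path that allocates a new block and repoints the indirect mapping instead of writing in place.

```python
# Illustrative sketch only: all names are hypothetical, none of this is ext4 code.
ERASE_BLOCK_BLOCKS = 1024  # assumed geometry: 4 MiB erase block / 4 KiB fs block


def align_up(block, alignment=ERASE_BLOCK_BLOCKS):
    """Round a physical block number up to the next erase block boundary."""
    return (block + alignment - 1) // alignment * alignment


class SketchAllocator:
    """Hands out physical blocks contiguously, starting erase-block aligned."""

    def __init__(self, first_free_block):
        # Item 5: the first allocation is aligned on an erase block.
        self.next_block = align_up(first_free_block)

    def alloc(self):
        # The logical block number is deliberately ignored here.
        block = self.next_block
        self.next_block += 1
        return block


class IndirectFile:
    """A logical-to-physical map standing in for an indirect block."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.mapping = {}  # logical block -> physical block
        self.freed = []    # old physical blocks released by copy-on-write

    def write(self, logical_block):
        # Item 4: never overwrite in place. Allocate a new block, then
        # update the "indirect block" entry to point to it.
        new_phys = self.allocator.alloc()
        old_phys = self.mapping.get(logical_block)
        if old_phys is not None:
            self.freed.append(old_phys)
        self.mapping[logical_block] = new_phys
        return new_phys


allocator = SketchAllocator(first_free_block=1500)
f = IndirectFile(allocator)
print(f.write(7))    # 2048, the first erase block boundary at or after 1500
print(f.write(900))  # 2049, contiguous even though the logical block is far away
print(f.write(7))    # 2050, the overwrite goes to a new block
print(f.freed)       # [2048]
```

A real implementation would also have to deal with journaling the indirect block update, reclaiming the freed blocks, and deciding when a workload counts as "random overwrites"; none of that is modeled here.
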
The last two are a little bit complicated, but I'm certain we could
implement and stabilize it faster than f2fs can be stabilized. (See
previous discussions regarding how confident btrfs people were that
they could stabilize it more quickly than all previous experience with
gpfs, jfs, advfs, zfs, etc., because, well, Open Source Is Different.)
If anyone at Linaro is interested in trying their hand on some kernel
file system work, they should contact me. :-)
- Ted
P.S. I still think part of the right answer is to investigate replacing
sqlite with something like OpenLDAP's mdb --- which has a drop-in
replacement sqlite API shim layer BTW --- and which beats the pants
off of sqlite's performance without requiring kernel-level changes,
but given that people seem wedded to sqlite....