Message-id: <99269303-4BAF-4977-A19E-EBF5BD7392DF@sun.com>
Date: Mon, 23 Nov 2009 14:45:38 -0700
From: Andreas Dilger <adilger@....com>
To: Frank Mayhar <fmayhar@...gle.com>
Cc: tytso@....edu, Curt Wohlgemuth <curtw@...gle.com>,
ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: Bug in extent zeroout: blocks not marked as new
On 2009-11-23, at 14:17, Frank Mayhar wrote:
> Finally, we have a question about the zero-out path: Is there any
> known, concrete improvement given by doing the zero-out as opposed
> to just continuing to split the extents? At the moment, by the way,
> there is one definite problem: Since it doesn't try to do a merge
> left (which it should) it invariably leaves a 14-block extent
> fragment, thus increasing fragmentation of the file. It's not a
> huge problem (since the extents are in fact contiguous) but it's
> there.
The intent is to avoid splitting the uninitialized extent further when
there is no longer any benefit in doing so. Writing out 8kB vs. 64kB is
in the noise these days, but splitting the extent adds overhead
(larger extent tree, more lookups, etc). If we were to continue
splitting, it would leave smaller and smaller uninitialized extents.
As you point out, the newly-initialized extent should be merged with
its left neighbor. If we do the zero-out at the point where actually
writing zeroes is cost effective, there is no longer an uninitialized
extent to track, so the whole extent can be merged with its left
neighbor, avoiding any extra overhead.
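
To make the merge condition concrete, here is a minimal sketch. It is
not the kernel code: struct sketch_extent and both helpers are
hypothetical names invented for illustration, the "zero-out" is
modelled only by clearing a flag, and the real limit on extent length
is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified extent record (not the on-disk ext4 layout). */
struct sketch_extent {
    uint32_t lblk;   /* first logical block covered */
    uint64_t pblk;   /* first physical block */
    uint32_t len;    /* length in blocks */
    bool     uninit; /* true if preallocated but never written */
};

/*
 * Two neighbors can be merged only if they are in the same state and
 * are contiguous both logically and physically.
 */
static bool sketch_can_merge(const struct sketch_extent *left,
                             const struct sketch_extent *right)
{
    return left->uninit == right->uninit &&
           left->lblk + left->len == right->lblk &&
           left->pblk + left->len == right->pblk;
}

/*
 * Zero out all of 'right' (modelled here by clearing the flag) and
 * fold it into 'left' when possible. Returns true if the merge
 * happened, in which case 'right' is left empty.
 */
static bool sketch_zeroout_and_merge(struct sketch_extent *left,
                                     struct sketch_extent *right)
{
    assert(right->len > 0);
    right->uninit = false; /* stands in for writing zeroes to disk */
    if (!sketch_can_merge(left, right))
        return false;
    left->len += right->len;
    right->len = 0;
    return true;
}
```

With an initialized 50-block extent on the left and a contiguous
14-block uninitialized remainder on the right, the two cannot merge
while their states differ; after the zero-out they collapse into one
64-block extent.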
The other time we HAVE to zero out the uninitialized extent is if the
filesystem does not have any free blocks to add a new extent, so the
uninitialized extent is zeroed entirely and then converted to
initialized.
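
Putting the two cases together, the policy could be sketched roughly as
below. The names (sketch_choose, SKETCH_ZEROOUT_CUTOFF_BYTES) are
hypothetical, and treating the 64kB cutoff as inclusive is an
assumption made for the example, not something the kernel code is
known to do.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_action { SKETCH_SPLIT, SKETCH_ZEROOUT };

/* Assumed cutoff: at or below this size, writing zeroes is "in the noise". */
#define SKETCH_ZEROOUT_CUTOFF_BYTES (64u * 1024u)

/*
 * Decide what to do with an uninitialized extent that is being
 * partially written.
 */
static enum sketch_action sketch_choose(uint32_t ext_len_blocks,
                                        uint32_t block_size,
                                        uint64_t free_blocks)
{
    uint64_t ext_bytes;

    assert(block_size > 0);
    ext_bytes = (uint64_t)ext_len_blocks * block_size;

    /* No room to grow the extent tree: zeroing is the only option. */
    if (free_blocks == 0)
        return SKETCH_ZEROOUT;

    /* Small extent: zeroing is cheaper than tracking another split. */
    if (ext_bytes <= SKETCH_ZEROOUT_CUTOFF_BYTES)
        return SKETCH_ZEROOUT;

    return SKETCH_SPLIT;
}
```

For example, with 4kB blocks a 14-block extent (56kB) would be zeroed,
while a 1024-block extent would be split unless the filesystem had no
free blocks left.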
Using 64kB as the cutoff for uninitialized extents is very reasonable
these days, though we may in fact want to make that dynamic based on
the superblock s_raid_stripe_width, for SSDs with 128kB erase blocks
and/or to avoid read-modify-write within a single RAID stripe.
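
One possible shape for such a dynamic cutoff, purely as a sketch: the
function name and the "larger of the two" rule are hypothetical, and it
assumes s_raid_stripe_width is expressed in filesystem blocks with 0
meaning "not set".

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default cutoff when no stripe width is configured. */
#define SKETCH_DEFAULT_CUTOFF_BYTES (64u * 1024u)

/*
 * Return the zero-out cutoff in bytes: the full stripe size if that is
 * larger than the default, otherwise the default.
 * stripe_width_blocks is assumed to mirror s_raid_stripe_width
 * (in filesystem blocks, 0 if unset).
 */
static uint64_t sketch_zeroout_cutoff(uint32_t stripe_width_blocks,
                                      uint32_t block_size)
{
    uint64_t stripe_bytes;

    assert(block_size > 0);
    stripe_bytes = (uint64_t)stripe_width_blocks * block_size;

    return stripe_bytes > SKETCH_DEFAULT_CUTOFF_BYTES
               ? stripe_bytes
               : SKETCH_DEFAULT_CUTOFF_BYTES;
}
```

With 4kB blocks, a stripe width of 32 blocks would raise the cutoff to
128kB, while an unset or smaller stripe width would keep it at 64kB.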
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.