Message-id: <913B5E53-3552-4D39-B8D3-5598A5D28712@sun.com>
Date: Tue, 08 Dec 2009 11:24:07 -0700
From: Andreas Dilger <adilger@....com>
To: Vyacheslav Dubeyko <Vyacheslav.Dubeyko@...onis.com>
Cc: "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: About reserve of blocks for "overflow extents" in ext4 metadata
On 2009-12-08, at 03:03, Vyacheslav Dubeyko wrote:
> I think it makes sense to have in the ext4 metadata a reserve of
> blocks for "overflow extents" (the extents that form a file's
> extent tree and are placed in blocks described by the inode's
> i_block field). The reserve of blocks for "overflow extents" can
> be placed (when mkfs creates the ext4 file system) after the inode
> table of every virtual (FLEX_BG) group, as one contiguous run of
> blocks. The size and placement of this reserve have to be described
> by a free special inode.
>
> In my opinion, the reserve of blocks for "overflow extents" resolves
> the following problems:
> 1) When shrinking an ext4 volume (especially a very fragmented
> one), it can be very difficult to estimate whether the resize will
> succeed, because of how extent trees are currently laid out on the
> volume. During the resize there may be too few free blocks to
> rebuild the extent trees of relocated files. The reserve of blocks
> for "overflow extents" guards against this problem during
> resizes.
> 2) The presence of the reserve of blocks for "overflow extents"
> means that all existing extent trees of files will be located in
> one place. This, and placing the reserve just after the inode
> table, will increase the efficiency of operations on extent trees,
> in my opinion.
> 3) The localized layout of files' extent trees also means efficient
> journaling of this metadata.

In fact, for most files the 4 extents that can be stored within the
inode itself are enough to hold all of the file's extents. A hard
reservation is generally sub-optimal: if too many blocks are reserved
it wastes space (causing ENOSPC earlier than necessary), and if too
few are reserved it will cause the same failures you report today.
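
To illustrate where the figure of 4 comes from, here is a rough
sketch. The struct layouts are simplified versions of the kernel's
ext4_extent_header and ext4_extent (endianness annotations dropped);
the helper entries_that_fit() and the constant I_BLOCK_BYTES are
made-up names for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified on-disk layouts modeled on the kernel's ext4_extents.h.
 * The real definitions use little-endian types (__le16/__le32). */
struct extent_header {
    uint16_t eh_magic;       /* 0xF30A */
    uint16_t eh_entries;     /* number of valid entries in this node */
    uint16_t eh_max;         /* capacity of this node, in entries */
    uint16_t eh_depth;       /* 0 means this node is a leaf */
    uint32_t eh_generation;
};

struct extent {
    uint32_t ee_block;       /* first logical block covered */
    uint16_t ee_len;         /* number of blocks covered */
    uint16_t ee_start_hi;    /* high 16 bits of the physical block */
    uint32_t ee_start_lo;    /* low 32 bits of the physical block */
};

/* Both records are 12 bytes on disk; fail the build if padding sneaks in. */
static_assert(sizeof(struct extent_header) == 12, "header must be 12 bytes");
static_assert(sizeof(struct extent) == 12, "extent must be 12 bytes");

/* i_block is 15 32-bit words (hypothetical constant name). */
#define I_BLOCK_BYTES 60u

/* Hypothetical helper: how many extent entries fit in a node of the
 * given size once the header has been accounted for. */
static unsigned entries_that_fit(unsigned node_bytes)
{
    return (unsigned)((node_bytes - sizeof(struct extent_header)) /
                      sizeof(struct extent));
}
```

Calling entries_that_fit(I_BLOCK_BYTES) gives 4, the in-inode limit
mentioned above. For comparison, entries_that_fit(4096) gives 340, so
a file needs a fifth extent before it uses any block outside the
inode, and a single 4 KiB leaf block then covers a few hundred more.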

I wouldn't object to tuning the block allocator to pack index and
extent blocks into shared (in-memory) preallocated regions, but I
don't think that needs to be a hard reservation. The mballoc code
already has the concept of aggregating small IOs into a single free
chunk, and it makes sense to put the index/extent blocks together in
this way, to avoid seeking during e2fsck, and to avoid fragmenting the
free space with small allocations.
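
A toy model of the "soft" approach, as opposed to a hard reservation,
might look like the following. This is not mballoc code: struct
meta_region, alloc_extent_block(), release_region() and the fallback
allocator are all hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory region of contiguous free blocks, shared by
 * the index/leaf block allocations of all files in a group. */
struct meta_region {
    uint64_t start;   /* first physical block of the region */
    uint32_t len;     /* number of blocks in the region */
    uint32_t used;    /* blocks already handed out */
};

/* Stand-in for the general-purpose allocator: hands out block numbers
 * from a simple counter. */
static uint64_t fallback_next = 1000000;

static uint64_t fallback_alloc(void)
{
    return fallback_next++;
}

/* Allocate one block for an extent-tree node. The shared region is
 * preferred, so nodes belonging to different files end up adjacent.
 * When the region is exhausted the request falls through to the
 * general allocator; the region itself never causes a failure. */
static uint64_t alloc_extent_block(struct meta_region *r)
{
    if (r->used < r->len)
        return r->start + r->used++;
    return fallback_alloc();
}

/* Shrink the region to what was actually used and report how many
 * blocks go back to ordinary free space. Because nothing is reserved
 * on disk, the unused tail is not lost to data allocations. */
static uint32_t release_region(struct meta_region *r)
{
    assert(r->used <= r->len);
    uint32_t unused = r->len - r->used;
    r->len = r->used;
    return unused;
}
```

With a region of two blocks starting at block 5000, the first two
calls to alloc_extent_block() return 5000 and 5001, and the third
falls back to the general allocator. An on-disk reservation would
have had to either fail or be sized in advance.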

In fact, I thought Ted had done some work in this area already?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.