[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <462E7C47.8080604@ksu.edu>
Date: Tue, 24 Apr 2007 16:53:11 -0500
From: Amit Gud <gud@....edu>
To: Nikita Danilov <nikita@...sterfs.com>
CC: David Lang <david.lang@...italinsight.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
val_henson@...ux.intel.com, riel@...riel.com, zab@...bo.net,
arjan@...radead.org, suparna@...ibm.com, brandon@...p.org,
karunasagark@...il.com
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck
Nikita Danilov wrote:
> Maybe I failed to describe the problem presicely.
>
> Suppose that all chunks have been checked. After that, for every inode
> I0 having continuations I1, I2, ... In, one has to check that every
> logical block is presented in at most one of these inodes. For this one
> has to read I0, with all its indirect (double-indirect, triple-indirect)
> blocks, then read I1 with all its indirect blocks, etc. And to repeat
> this for every inode with continuations.
>
> In the worst case (every inode has a continuation in every chunk) this
> obviously is as bad as un-chunked fsck. But even in the average case,
> total amount of io necessary for this operation is proportional to the
> _total_ file system size, rather than to the chunk size.
>
Perhaps, I should talk about how continuation inodes are managed /
located on disk. (This is how it is in my current implementation)
Right now, there is no distinction between an inode and continuation
inode (also referred to as 'cnode' below), except for the
EXT2_IS_CONT_FL flag. Every inode holds a list of static number of
inodes, currently limited to 4.
The structure looks like this:
---------- ----------
| cnode 0 |---------->| cnode 0 |----------> to another cnode or NULL
---------- ----------
| cnode 1 |----- | cnode 1 |-----
---------- | ---------- |
| cnode 2 |-- | | cnode 2 |-- |
---------- | | ---------- | |
| cnode 3 | | | | cnode 3 | | |
---------- | | ---------- | |
| | | | | |
inodes inodes or NULL
I.e. only first cnode in the list carries forward the chain if all the
slots are occupied.
Every cnode# field contains
{
ino_t cnode;
__u32 start; /* starting logical block number */
__u32 end; /* ending logical block number */
}
(current implementation has just one field: cnode)
I thought of this structure to avoid recursion and / or use of any data
structure while traversing the continuation inodes.
Additional flag, EXT2_SPARSE_CONT_FL would indicate whether the inode
has any sparse portions. 'start' and 'end' fields are used to speed-up
finding a cnode given a logical block number without the need of
actually reading the inode - this can be done away with, perhaps more
conveniently by, pinning the cnodes in memory as and when read.
Now going back to the Nikita's question, all the cnodes for an inode
need to be scanned iff 'start' field or number of blocks or flag
EXT2_SPARSE_CONT_FL in any of its cnodes is altered.
And yes, the whole attempt is to reduce the number of continuation inodes.
Comments, suggestions welcome.
AG
--
May the source be with you.
http://www.cis.ksu.edu/~gud
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists