linux-kernel - Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

Message-ID: <462E7C47.8080604@ksu.edu>
Date:	Tue, 24 Apr 2007 16:53:11 -0500
From:	Amit Gud <gud@....edu>
To:	Nikita Danilov <nikita@...sterfs.com>
CC:	David Lang <david.lang@...italinsight.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	val_henson@...ux.intel.com, riel@...riel.com, zab@...bo.net,
	arjan@...radead.org, suparna@...ibm.com, brandon@...p.org,
	karunasagark@...il.com
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

Nikita Danilov wrote:
> Maybe I failed to describe the problem precisely.
> 
> Suppose that all chunks have been checked. After that, for every inode
> I0 having continuations I1, I2, ... In, one has to check that every
> logical block is present in at most one of these inodes. For this one
> has to read I0, with all its indirect (double-indirect, triple-indirect)
> blocks, then read I1 with all its indirect blocks, etc. And to repeat
> this for every inode with continuations.
> 
> In the worst case (every inode has a continuation in every chunk) this
> obviously is as bad as un-chunked fsck. But even in the average case,
> total amount of io necessary for this operation is proportional to the
> _total_ file system size, rather than to the chunk size.
> 

Perhaps I should explain how continuation inodes are managed and 
located on disk. (This describes my current implementation.)

Right now, there is no distinction between an inode and a continuation 
inode (also referred to as 'cnode' below), except for the 
EXT2_IS_CONT_FL flag. Every inode holds a fixed-size list of 
continuation inodes, currently limited to 4 slots.

The structure looks like this:

  ----------		----------
| cnode 0  |---------->| cnode 0  |----------> to another cnode or NULL
  ----------		----------
| cnode 1  |-----      | cnode 1  |-----
  ----------	|	----------	|
| cnode 2  |--	|      | cnode 2  |--   |
  ----------  |	|	----------  |   |
| cnode 3  | |	|      | cnode 3  | |   |
  ----------  |	|	----------  |   |
	  |  |  |		 |  |   |

	   inodes		inodes or NULL

I.e. only the first cnode in the list carries the chain forward, and 
only once all the slots are occupied.

Every cnode# field contains
{
	ino_t cnode;
	__u32 start;	/* starting logical block number */
	__u32 end;	/* ending logical block number */
}

(current implementation has just one field: cnode)

I chose this structure to avoid recursion and / or the use of any 
auxiliary data structure while traversing the continuation inodes.
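
To illustrate the traversal, here is a minimal in-memory sketch, not the 
actual ChunkFS code. It assumes my reading of the diagram above: slots 
1-3 point at plain continuation inodes, and only slot 0 leads to a cnode 
whose own slot list is followed. The names toy_inode, TOY_CONT_SLOTS and 
collect_continuations are hypothetical, and pointers stand in for 
on-disk inode numbers.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CONT_SLOTS 4	/* the mail says the list is limited to 4 */

/* Hypothetical in-memory stand-in for an inode and its cnode slots. */
struct toy_inode {
	unsigned long ino;
	struct toy_inode *cont[TOY_CONT_SLOTS];	/* NULL means empty slot */
};

/*
 * Walk every continuation reachable from 'head' with a plain loop:
 * scan the four slots, then hop to slot 0 and repeat. No recursion
 * and no stack or queue is needed. Up to 'cap' inode numbers are
 * stored in 'out'; the return value is the total number found.
 * A corrupted, cyclic chain is not detected by this sketch.
 */
static size_t collect_continuations(const struct toy_inode *head,
				    unsigned long *out, size_t cap)
{
	size_t n = 0;

	assert(out != NULL || cap == 0);
	for (const struct toy_inode *cur = head; cur != NULL;
	     cur = cur->cont[0]) {
		for (int i = 0; i < TOY_CONT_SLOTS; i++) {
			const struct toy_inode *c = cur->cont[i];

			if (c == NULL)
				continue;
			if (n < cap)
				out[n] = c->ino;
			n++;
		}
	}
	return n;
}
```

The only state carried between iterations is the 'cur' pointer, which 
is the property the layout is meant to provide.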

An additional flag, EXT2_SPARSE_CONT_FL, would indicate whether the 
inode has any sparse portions. The 'start' and 'end' fields speed up 
finding a cnode for a given logical block number without actually 
reading the inode. They could be dropped, perhaps more conveniently, 
by pinning the cnodes in memory as and when they are read.
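
As a rough sketch of how 'start' and 'end' could be used for lookup 
(again not the real implementation), the code below searches the slot 
lists for a logical block without touching the cnodes themselves. It 
assumes 'end' is inclusive and that slot 0 carries the chain; toy_slot, 
toy_cnode and find_cnode are hypothetical names, and a pointer replaces 
the on-disk ino_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CONT_SLOTS 4

struct toy_cnode;

/* One slot, mirroring the { cnode, start, end } fields from the mail. */
struct toy_slot {
	struct toy_cnode *cnode;	/* NULL means empty slot */
	uint32_t start;			/* starting logical block number */
	uint32_t end;			/* ending logical block, assumed inclusive */
};

struct toy_cnode {
	unsigned long ino;
	struct toy_slot slot[TOY_CONT_SLOTS];
};

/*
 * Return the cnode whose [start, end] range covers 'lblk', or NULL if
 * no slot in the chain covers it (the block may then belong to the
 * head inode itself or fall in a hole). Only slot metadata is read.
 */
static const struct toy_cnode *find_cnode(const struct toy_cnode *head,
					  uint32_t lblk)
{
	assert(head != NULL);
	for (const struct toy_cnode *cur = head; cur != NULL;
	     cur = cur->slot[0].cnode) {
		for (int i = 0; i < TOY_CONT_SLOTS; i++) {
			const struct toy_slot *s = &cur->slot[i];

			if (s->cnode != NULL &&
			    s->start <= lblk && lblk <= s->end)
				return s->cnode;
		}
	}
	return NULL;
}
```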

Now, going back to Nikita's question: all the cnodes for an inode need 
to be scanned iff the 'start' field, the number of blocks, or the 
EXT2_SPARSE_CONT_FL flag of any of its cnodes has been altered.
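
The condition can be written down as a small predicate. This is only a 
sketch: how a checker would obtain the earlier values is not specified 
here, so the 'before' array is purely hypothetical, as are the names 
cnode_summary and needs_full_cnode_scan. A bool stands in for 
EXT2_SPARSE_CONT_FL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three per-cnode properties named in the condition. */
struct cnode_summary {
	uint32_t start;		/* starting logical block number */
	uint32_t nblocks;	/* number of blocks */
	bool sparse;		/* stands in for EXT2_SPARSE_CONT_FL */
};

/*
 * Return true iff any of the n cnodes differs between the two
 * snapshots in 'start', block count, or the sparse flag. Both arrays
 * describe the same cnodes in the same order.
 */
static bool needs_full_cnode_scan(const struct cnode_summary *before,
				  const struct cnode_summary *after,
				  size_t n)
{
	assert(n == 0 || (before != NULL && after != NULL));
	for (size_t i = 0; i < n; i++) {
		if (before[i].start != after[i].start ||
		    before[i].nblocks != after[i].nblocks ||
		    before[i].sparse != after[i].sparse)
			return true;
	}
	return false;
}
```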

And yes, the whole attempt is to reduce the number of continuation inodes.

Comments, suggestions welcome.

AG
-- 
May the source be with you.
http://www.cis.ksu.edu/~gud
