Message-ID: <Pine.LNX.4.63.0704241222110.7701@qynat.qvtvafvgr.pbz>
Date:	Tue, 24 Apr 2007 12:26:32 -0700 (PDT)
From:	David Lang <david.lang@...italinsight.com>
To:	Nikita Danilov <nikita@...sterfs.com>
cc:	Amit Gud <gud@....ksu.edu>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, val_henson@...ux.intel.com,
	riel@...riel.com, zab@...bo.net, arjan@...radead.org,
	suparna@...ibm.com, brandon@...p.org, karunasagark@...il.com,
	gud@....edu
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck

On Tue, 24 Apr 2007, Nikita Danilov wrote:

> David Lang writes:
> > On Tue, 24 Apr 2007, Nikita Danilov wrote:
> >
> > > Amit Gud writes:
> > >
> > > Hello,
> > >
> > > >
> > > > This is an initial implementation of the ChunkFS technique, briefly discussed
> > > > at: http://lwn.net/Articles/190222 and
> > > > http://cis.ksu.edu/~gud/docs/chunkfs-hotdep-val-arjan-gud-zach.pdf
> > >
> > > I have a couple of questions about chunkfs repair process.
> > >
> > > First, as I understand it, each continuation inode is a sparse file,
> > > mapping some subset of logical file blocks into block numbers. Then it
> > > seems that during the "final phase" fsck has to check that these partial
> > > mappings are consistent, for example, that no two different continuation
> > > inodes for a given file contain a block number for the same offset. This
> > > check requires a scan of all chunks (rather than only of those "active
> > > during a crash"), which seems to return us back to the scalability problem
> > > chunkfs tries to address.
> >
> > Not quite.
> >
> > This checking is an O(n^2) or worse problem, and it can eat a lot of memory
> > in the process. With chunkfs you divide the problem by a large constant (100
> > or more) for the checks of individual chunks. After those are done, the final
> > pass checking the cross-chunk links doesn't have to keep track of everything;
> > it only needs to check those links and what they point to.
>
> Maybe I failed to describe the problem precisely.
>
> Suppose that all chunks have been checked. After that, for every inode
> I0 having continuations I1, I2, ... In, one has to check that every
> logical block is present in at most one of these inodes. For this, one
> has to read I0, with all its indirect (double-indirect, triple-indirect)
> blocks, then read I1 with all its indirect blocks, etc. And to repeat
> this for every inode with continuations.
>
> In the worst case (every inode has a continuation in every chunk) this
> obviously is as bad as an un-chunked fsck. But even in the average case, the
> total amount of I/O necessary for this operation is proportional to the
> _total_ file system size, rather than to the chunk size.

Actually, it should be proportional to the number of continuation inodes. The
expectation (and design) is that they are rare.
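
To make the final pass concrete, here is a rough sketch of the per-file
overlap check described above -- plain userspace C, not the actual patch code,
with the on-disk walk replaced by made-up in-memory offset lists:

/*
 * Sketch only, not the patch code: a real fsck would walk each chunk's
 * inode tables and indirect blocks.  Here every continuation inode is
 * reduced to an in-memory list of the logical offsets it maps, which is
 * enough to show the per-file collision check.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LBLOCKS (1u << 20)          /* logical blocks tracked per file */

struct cont_inode {
        uint32_t nr_mapped;
        const uint64_t *offsets;        /* logical offsets mapped by this chunk */
};

/* Return false if any logical offset is mapped by more than one inode. */
static bool check_file_continuations(const struct cont_inode *conts, int n)
{
        unsigned char *seen = calloc(MAX_LBLOCKS, 1);
        bool ok = true;
        int i;

        if (!seen)
                return false;

        for (i = 0; i < n && ok; i++) {
                uint32_t j;

                for (j = 0; j < conts[i].nr_mapped; j++) {
                        uint64_t off = conts[i].offsets[j];

                        if (off >= MAX_LBLOCKS || seen[off]) {
                                fprintf(stderr, "offset %llu out of range or claimed twice\n",
                                        (unsigned long long)off);
                                ok = false;
                                break;
                        }
                        seen[off] = 1;
                }
        }
        free(seen);
        return ok;
}

int main(void)
{
        const uint64_t a[] = { 0, 1, 2 }, b[] = { 2, 3 };  /* offset 2 overlaps */
        const struct cont_inode conts[] = { { 3, a }, { 2, b } };

        return check_file_continuations(conts, 2) ? 0 : 1;
}

The point is that the "seen" state only has to cover one file at a time, and
only files that actually have continuation inodes ever reach this loop.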

If you get into the worst-case situation of all of them being continuation
inodes, then you are actually worse off than you were to start with (as you
say), but numbers from people's real filesystems (assuming a chunk size equal
to a block cluster size) indicate that we are more on the order of a fraction
of a percent of the inodes. And the expectation is that since the chunk sizes
will be substantially larger than the block cluster sizes, this should be
reduced even more.
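
To put rough numbers on that (invented for illustration, not measured): on a
filesystem with 10 million inodes where 0.5% of them have continuations, the
final pass re-reads the block mappings of roughly 50,000 inodes, versus all
10 million for an un-chunked fsck, and larger chunks should push that
fraction down further.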

David Lang
