Message-ID: <20061026063648.GE8394166@melbourne.sgi.com>
Date:	Thu, 26 Oct 2006 16:36:48 +1000
From:	David Chinner <dgc@....com>
To:	Theodore Tso <tytso@....edu>
Cc:	David Chinner <dgc@....com>, Jeff Garzik <jeff@...zik.org>,
	Barry Naujok <bnaujok@...bourne.sgi.com>,
	"'Dave Kleikamp'" <shaggy@...tin.ibm.com>,
	"'Alex Tomas'" <alex@...sterfs.com>, "'Jan Kara'" <jack@...e.cz>,
	linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [RFC] Ext3 online defrag

On Wed, Oct 25, 2006 at 11:33:16PM -0400, Theodore Tso wrote:
> On Thu, Oct 26, 2006 at 11:40:20AM +1000, David Chinner wrote:
> > We don't need to expose anything filesystem specific to userspace to
> > implement this.  Online data movement (i.e. the defrag mechanism)
> > becomes something like:
> > 
> > 	do {
> > 		get_free_list(dst_fd, location, len, list)
> > 		/* select extent to use */
> > 		alloc_from_list(dst_fd, list[X], off, len)
> > 	} while (ENOALLOC)
> > 	move_data(src_fd, dst_fd, off, len);
> > 
> > And this would work on any filesystem type that implemented these
> > interfaces. Hence tools like a startup file optimiser would
> > only need to be written once, rather than needing a different
> > tool for every different filesystem type.....
> 
> Yeah, but that's simply not enough. 

Not enough for what?

> A good defragger needs to know

Oh, we're back to defrag again. :/

> about a filesystem's allocation policies, and move files so they are
> optimally located, given the filesystem layout.  For example, in
> ext2/3/4 we will want to move blocks so they are in the same block group
> as the inode.  That's filesystem specific information; other
> filesystems will require different policies.

Of which a good chunk of policies will be common. The above policy
has been around for many, many years and is implemented in many, many
filesystems (even XFS).

> > 		get_free_list(dst_fd, location, len, list)

location == allocation policy, e.g. give me a list of free blocks:

	- anywhere (default filesystem policy applies)
	- near block number X
	- at block X
	- in block/allocation group Y
	- of the largest contiguous regions in (one of the above)
	- at least N blocks in length
	- near inode src_fd
	- in storage tier 3
	
then you select one of the regions that was returned and attempt
to allocate that.
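
As a rough sketch of that selection step (every name here is
hypothetical, none of this is an existing kernel or libc interface,
and only three of the policies above are modelled), the location hint
and the choice of a region from the returned list might look like:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placement hints; these names are illustrative only. */
enum loc_policy {
	LOC_ANYWHERE,   /* default filesystem policy applies */
	LOC_NEAR_BLOCK, /* near block number arg */
	LOC_AT_BLOCK,   /* exactly at block number arg */
};

struct location {
	enum loc_policy policy;
	uint64_t arg;     /* target block for NEAR/AT policies */
	uint64_t min_len; /* only consider regions of at least this many blocks */
};

struct free_extent {
	uint64_t start; /* first free block */
	uint64_t len;   /* number of contiguous free blocks */
};

static uint64_t distance(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/*
 * Pick one region from a returned free list.
 * Returns an index into list, or -1 if no region satisfies the hint.
 */
static int pick_extent(const struct free_extent *list, size_t n,
		       const struct location *loc)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (list[i].len < loc->min_len)
			continue;
		if (loc->policy == LOC_AT_BLOCK && list[i].start != loc->arg)
			continue;
		if (best < 0) {
			best = (int)i;
		} else if (loc->policy == LOC_NEAR_BLOCK) {
			/* prefer the region starting closest to the target */
			if (distance(list[i].start, loc->arg) <
			    distance(list[best].start, loc->arg))
				best = (int)i;
		} else if (list[i].len > list[best].len) {
			/* otherwise prefer the largest contiguous region */
			best = (int)i;
		}
	}
	return best;
}
```

The caller owns the choice here: the filesystem only reports candidate
regions, and whatever policy the tool wants goes into pick_extent().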

You can put whatever filesystem-specific stuff you need around this
to arrive at the decision of where to put the file, but you've got
to allocate the new blocks, move the data to them, and swap them
over. Every defragger needs to do this, regardless of the filesystem
type. So why not provide a framework for it, especially as the
framework is useful for far more than just the data movement part
of a defrag application?
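
To make the allocate/move/swap sequence concrete, here is a toy,
purely in-memory model of it. This is only a sketch under stated
assumptions: struct toyfs and the toy_*() helpers are invented for
illustration, -EAGAIN stands in for the ENOALLOC condition in the
pseudocode, and a real implementation would do the swap atomically
inside the filesystem.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NBLOCKS 16

/* Toy in-memory "filesystem": one byte of data per block plus a used map. */
struct toyfs {
	uint8_t data[NBLOCKS];
	uint8_t used[NBLOCKS];
};

/* Stand-in for get_free_list(): first free run of len blocks, or -1. */
static int toy_find_free(const struct toyfs *fs, int len)
{
	int run = 0;

	for (int b = 0; b < NBLOCKS; b++) {
		run = fs->used[b] ? 0 : run + 1;
		if (run == len)
			return b - len + 1;
	}
	return -1;
}

/* Stand-in for alloc_from_list(): fails if someone else took the blocks. */
static int toy_alloc(struct toyfs *fs, int start, int len)
{
	for (int b = start; b < start + len; b++)
		if (fs->used[b])
			return -EAGAIN;
	memset(&fs->used[start], 1, (size_t)len);
	return 0;
}

/*
 * Relocate a file whose blocks are listed in map[] into one contiguous run.
 * Returns the new start block, or -ENOSPC if no suitable run exists.
 */
static int toy_relocate(struct toyfs *fs, int *map, int nblocks)
{
	int start, ret;

	do {
		start = toy_find_free(fs, nblocks);
		if (start < 0)
			return -ENOSPC;
		ret = toy_alloc(fs, start, nblocks);
	} while (ret == -EAGAIN);

	for (int i = 0; i < nblocks; i++) {
		int old = map[i];

		fs->data[start + i] = fs->data[old]; /* move the data */
		map[i] = start + i;                  /* swap the mapping over */
		fs->used[old] = 0;                   /* release the old block */
	}
	return start;
}
```

Nothing in toy_relocate() depends on how the filesystem lays things
out; the only filesystem-specific decision is which free region gets
picked before the allocation attempt.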

> > Remember, I'm not just talking about defrag - I'm talking about
> > an interface that is actually useful to apps that might care
> > about how data is laid out on disk but the application writers
> > don't know anything about how filesystem X or Y or Z is
> > implemented. Putting the burden of learning about filesystem
> > internals on application developers is not the correct solution.
> 
> Unfortunately, if you want to do a good job, a defragger *has* to know
> about some very low-level filesystem specific information, if it wants
> to do a good job.

Back to defrag. Again. Bigger picture, guys, bigger picture.....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group