Message-ID: <20061025145450.GF21082@atrey.karlin.mff.cuni.cz>
Date:	Wed, 25 Oct 2006 16:54:50 +0200
From:	Jan Kara <jack@...e.cz>
To:	adilger@...sterfs.com, Theodore Tso <tytso@....edu>,
	David Chinner <dgc@....com>, Jeff Garzik <jeff@...zik.org>,
	Alex Tomas <alex@...sterfs.com>, Jan Kara <jack@...e.cz>,
	linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [RFC] Ext3 online defrag

> On Oct 24, 2006  15:44 -0400, Theodore Tso wrote:
> > First of all, we would need a way of allowing userspace to specify
> > which blocks should be used in the preallocation.
> 
> Presumably it could do this in the same way it will be specifying
> which blocks to relocate in the defragger - by passing an extent.
> You would be required to pass the file offset for which to preallocate,
> and optionally an extent for the on-disk allocation itself (if none is
> supplied the kernel will allocate the best extent it can).
> 
> > Secondly, we would need a way of marking blocks as "preallocated but
> > not pre-zeroed"; otherwise we would have to zero out all of the blocks
> > in order to assure security (don't want userspace programs seeing the
> > previous contents of the data blocks), only to do the copy and the
> > extents vector swap.
> 
> This could be mitigated by having the preallocation be done (in the
> defragment case) against a temporary inode in the orphan list (as
> the initial patch did) so if there is a crash it will be released.
> The temporary inode will not be linked into the namespace so it cannot
> be read - only used to hold preallocation.  If this was a write-only
> file handle then we should be OK?
> 
> For defragger purposes this would need:
> 
> - "allocate new temporary inode" (VFS + fs, returns write-only fh if
>    fs can't properly handle uninitialized extents, or doesn't request
>    full-extent zeroing)
> 
>    for each extent to defragment {
> 	- "preallocate extents on temp inode" (fs specific internals)
> 	- "copy data from orig to temp at offset X" (VFS, splice or
> 	   e.g. sys_copyfile(src, dst, offset, count) which Linus agreed
> 	   to at KS '05 for network filesystems)
> 	- "migrate copied extent to original inode" (fs specific internals)
>    }
> 
> - "free temporary inode" (just close of temp fh, frees unmigrated extents).
  Yes, this sounds feasible. We could split the defrag ioctl into two
pieces (adding a given extent to a file, and swapping extents), each of
which can have a generic interface.
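
To make the two-piece split concrete, here is a rough userspace model of
it in Python. This is only a sketch under stated assumptions, not a
proposed kernel interface: ToyInode, add_extent() and swap_extents() are
hypothetical names, an inode is reduced to a logical-to-physical block
map, and the data copy between the two operations is not modelled.

```python
class ToyInode:
    """Hypothetical stand-in for an inode: logical block -> physical block."""

    def __init__(self, extents=None):
        self.extents = dict(extents or {})


def add_extent(inode, logical, physical, length):
    """Piece 1: map `length` blocks at `logical` onto a contiguous
    physical run starting at `physical` (the preallocation step)."""
    for i in range(length):
        if logical + i in inode.extents:
            raise ValueError("logical block %d already mapped" % (logical + i))
    for i in range(length):
        inode.extents[logical + i] = physical + i


def swap_extents(orig, temp, logical, length):
    """Piece 2: exchange the mappings of [logical, logical + length)
    between the original inode and the temporary inode."""
    blocks = range(logical, logical + length)
    # Check everything first so a failure leaves both maps untouched.
    for lb in blocks:
        if lb not in temp.extents:
            raise ValueError("temp inode has no block at %d" % lb)
    for lb in blocks:
        old = orig.extents.pop(lb, None)
        orig.extents[lb] = temp.extents.pop(lb)
        if old is not None:
            # The old block now belongs to temp and is freed with it.
            temp.extents[lb] = old


# A fragmented three-block file gets a contiguous run at 500..502.
orig = ToyInode({0: 100, 1: 205, 2: 317})
temp = ToyInode()
add_extent(temp, 0, 500, 3)
# (the data copy from orig to temp would happen here)
swap_extents(orig, temp, 0, 3)
```

After the swap the temporary inode holds the old, fragmented blocks, so
closing it releases them, which matches the "free temporary inode" step
quoted above.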

> I don't think this is much more work than implementing all of this
> functionality as part of a monolithic online defrag function, assuming
> we don't require full-file copies in order to do defrag.
  Yes, it's no more work than supporting swapping of extents in the
middle of the file. I just haven't decided yet how to handle indirect
blocks when relocating in the middle of the file. Should they be
relocated or not? Probably they should be relocated at least when they
are fully contained in the relocated interval, or better said, when all
the blocks they reference are also in the interval (this also handles
the EOF case). But if you want to relocate the file in parts, this is
still not quite what you want (you won't be able to relocate indirect
blocks on the boundaries of the intervals) :(.
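
The containment rule can be sketched in a few lines of Python. This is
an illustration only, with assumptions stated up front: it considers
only leaf indirect blocks (the ones pointing directly at data blocks),
it assumes 4 KiB blocks and therefore 1024 block pointers per indirect
block, and relocatable_leaf_indirect_blocks() is a hypothetical helper
name. Leaf number k covers logical blocks 12 + k*1024 up to (but not
including) 12 + (k+1)*1024, clipped at EOF.

```python
EXT3_NDIR_BLOCKS = 12  # direct block pointers held in the ext3 inode


def relocatable_leaf_indirect_blocks(start, end, file_blocks,
                                     addr_per_block=1024):
    """Return the indices of leaf indirect blocks whose referenced data
    blocks, clipped at EOF, all lie inside the interval [start, end)."""
    result = []
    if file_blocks <= EXT3_NDIR_BLOCKS:
        return result  # file fits in the direct blocks
    # Number of leaf indirect blocks the file needs (ceiling division).
    n_leaves = -(-(file_blocks - EXT3_NDIR_BLOCKS) // addr_per_block)
    for k in range(n_leaves):
        lo = EXT3_NDIR_BLOCKS + k * addr_per_block
        hi = min(lo + addr_per_block, file_blocks)  # clip at EOF
        if start <= lo and hi <= end:
            result.append(k)
    return result


# A 3000-block file relocated in three parts of 1000 blocks each.
parts = [relocatable_leaf_indirect_blocks(s, s + 1000, 3000)
         for s in (0, 1000, 2000)]
```

For this 3000-block file the leaves cover [12, 1036), [1036, 2060) and
[2060, 3000). Relocating in 1000-block parts therefore qualifies only
the last leaf, while relocating [0, 3000) in one go qualifies all three,
which is exactly the boundary problem described above.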

								Honza
-- 
Jan Kara <jack@...e.cz>
SuSE CR Labs
