Message-ID: <20100211195624.GM739@thunk.org>
Date:	Thu, 11 Feb 2010 14:56:24 -0500
From:	tytso@....edu
To:	Kailas Joshi <kailas.joshi@...il.com>
Cc:	linux-ext4@...r.kernel.org, Jan Kara <jack@...e.cz>,
	Jiaying Zhang <jiayingz@...gle.com>
Subject: Re: Help on Implementation of EXT3 type Ordered Mode in EXT4

On Thu, Feb 11, 2010 at 01:02:15PM +0530, Kailas Joshi wrote:
> 
> We are assessing the use of copy-on-write technique to provide data
> level consistency in EXT3/EXT4. We have implemented this in EXT3 by
> using the Ordered mode of operation. Benchmark results for IOZone and
> Postmark are quite good. We could get the consistency equivalent to
> Journal mode with the overhead almost same as Ordered mode. However,
> there are a few cases (for example, file rewrite) where performance of
> Journal mode is better than our technique. We think that in EXT4, with
> the support for delayed block allocation and extents, these problems
> can be removed.

Ah, sorry, I misread your initial post; I thought you were trying to
reimplement the proposed ext4 mode data=guarded.

I've mostly given up on trying to get alloc_on_commit to work, for two
reasons.

The first is that one of the reasons why you might be closing the
transaction is if there's not enough space left in the journal.  But
if you are going to do a large number of data allocations at commit
time, there's no guarantee that there will be space in the journal for all of
the metadata blocks that might have to be modified in order to make
the block allocations.

The second problem with this scheme is a performance problem; while
you are handling delayed allocation blocks, you have to do this
while the journal is still locked, using magic handles that are
allowed to be created while the journal is locked.  That adds all
sorts of complexity, and that seems to be what you are thinking about
doing.  The problem though is that while this is going on, all other
file system activity has to be blocked.  So this will cause all sorts
of processes to become suspended waiting for all of the allocation
activity to complete, which may require bitmap allocation blocks to be
read in from disk, etc.

The trade off for all of these problems is that it allows you to delay
the block allocation for only 5 seconds.  The question is, is this
worth it, compared with simply mounting the file system with
nodelalloc?  It may be that all of this complexity doesn't produce enough
of a performance gain over simply using nodelalloc.

So maybe the solution for certain distributions that are catering to
the "inexperienced user" / "users who like to use unstable video
drivers" market is to mount with nodelalloc by default, and tell them
that if they want the performance improvements of delayed allocation,
they need to lobby to get the applications fixed.  

(After all, these problems are going to be around no matter whether
people use XFS or btrfs; most modern file systems are going to use
delayed allocation, so sooner or later the broken applications really
need to get fixed.  The defiant user's cry, "well, if you don't fix
this I'll switch to xfs/btrfs!" isn't going to help in this case....)
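
As a minimal sketch of what a "fixed" application might look like: the
message does not spell out the fix, so this assumes the commonly cited
pattern of writing to a temporary file, calling fsync on it, and only then
renaming it over the original.  The function name atomic_write is
hypothetical, and the directory fsync at the end assumes a Linux-style
system where a directory can be opened and synced.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) without relying on the file
    system to flush data blocks before the rename reaches disk.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the
    # rename below stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new contents to disk *before* the rename, so
            # delayed allocation cannot leave a zero-length file behind.
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application would then call, for example,
atomic_write("settings.conf", b"key=value\n") instead of truncating and
rewriting the file in place.  Note that mkstemp creates the file with
restrictive permissions, so a real application may want to copy the
original file's mode onto the temporary file before the rename.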

     	  	    		      - Ted
