Message-ID: <20100706185548.GA26677@thunk.org>
Date:	Tue, 6 Jul 2010 14:55:48 -0400
From:	tytso@....edu
To:	Daniel Taylor <Daniel.Taylor@....com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: inconsistent file placement

On Mon, Jul 05, 2010 at 06:49:34PM -0700, Daniel Taylor wrote:
> I realize that it is generally not a good idea to tune
> an operating system, or subsystem, for benchmarking, but
> there's something that I don't understand about ext[234]
> that is badly affecting our product.  File placement on
> newly-created file systems is inconsistent.  I can't,
> yet, call it a bug, but I really need to understand what
> is happening, and I cannot find, in the source code, the
> source of the randomization (related to "goal"???).

In ext3, it really is random.  The randomness you're looking for can
be found in fs/ext3/ialloc.c:find_group_orlov(), when it calls
get_random_bytes().  This is responsible for spreading directories
across the block groups, to try to prevent fragmented files.  Yes, if
all you care about is benchmarks which only use 10% of the entire file
system, and which don't adequately simulate file system aging, the
algorithms in ext3 will cause a lot of variability.
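
To illustrate why a random starting group makes placement vary from
run to run, here is a minimal Python sketch.  It is not the ext3 code:
pick_group(), simulate(), the group count and the free-inode threshold
are all hypothetical, and random.Random stands in for
get_random_bytes().

```python
import random

def pick_group(free_inodes, min_free, rng):
    """Pick a block group for a new directory (hypothetical, simplified).

    Start scanning at a random group and return the first group that
    has at least min_free free inodes; return -1 if none qualifies.
    """
    ngroups = len(free_inodes)
    start = rng.randrange(ngroups)  # stand-in for get_random_bytes()
    for i in range(ngroups):
        group = (start + i) % ngroups
        if free_inodes[group] >= min_free:
            return group
    return -1

def simulate(seed, ngroups=64, ndirs=8):
    """Place ndirs directories on a fresh file system; return their groups."""
    rng = random.Random(seed)
    free_inodes = [1024] * ngroups
    placed = []
    for _ in range(ndirs):
        group = pick_group(free_inodes, min_free=1, rng=rng)
        free_inodes[group] -= 1
        placed.append(group)
    return placed
```

Calling simulate() with two different seeds will generally give two
different lists of groups on an otherwise identical fresh file system,
which is the kind of run-to-run variability described above.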

Yes, if you use FAT-style algorithms which try to use the first free
inode and the first free block available, for the purposes of
competitive benchmarking (especially if the benchmarks are crap), you
can probably win against the competition.  Unfortunately, long-term
your product will probably be far more likely to suffer from file
system aging, as the blocks at the beginning of the file system become
badly fragmented.  Please don't do that, though (or, if you must,
please have a switch so that users can switch it from "competitive
benchmarking mode" to "friendly to real life users" mode).
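
A toy example of the aging effect, as a sketch only: the block bitmap,
first_free_alloc(), extents() and aged_extents() below are hypothetical
and far simpler than any real allocator.  Eight two-block files are
written with a first-free policy, every other one is deleted, and a new
eight-block file then has to fill the holes.

```python
def first_free_alloc(bitmap, nblocks):
    """Allocate nblocks by taking the lowest-numbered free blocks."""
    free = [i for i, used in enumerate(bitmap) if not used][:nblocks]
    if len(free) < nblocks:
        raise RuntimeError("not enough free blocks")
    for i in free:
        bitmap[i] = True
    return free

def extents(blocks):
    """Count contiguous runs in a sorted list of block numbers."""
    return sum(1 for i, b in enumerate(blocks)
               if i == 0 or b != blocks[i - 1] + 1)

def aged_extents():
    """Extent count of an 8-block file written after some create/delete churn."""
    bitmap = [False] * 32
    files = [first_free_alloc(bitmap, 2) for _ in range(8)]
    for f in files[::2]:          # delete every other small file
        for b in f:
            bitmap[b] = False
    return extents(first_free_alloc(bitmap, 8))
```

On a fresh bitmap the eight-block file is one contiguous extent; after
the churn in aged_extents() the same request is split into four.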

Ext4 uses very different algorithms, and it's not strictly speaking
random since it uses a cut-down md4 hash of the directory name to
decide where to place the directory inode (and the location of the
directory inode affects both the files created in that directory as
well as the blocks allocated to those files, as in ext3).  So as long
as the directory hash seed in the superblock stays constant, and the
directory and file names created stay constant, the inode and block
layout will also be consistent.
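
The determinism can be sketched in a few lines of Python.  This is an
illustration under stated assumptions, not the ext4 code: dir_group()
is a hypothetical helper, SHA-256 is swapped in for the cut-down md4
hash, and the real allocator is more involved than a single modulo.

```python
import hashlib

def dir_group(name, seed, ngroups):
    """Map a directory name to a block group (hypothetical, simplified).

    seed is a bytes value standing in for the superblock hash seed.
    SHA-256 is a stand-in hash here, not what ext4 actually uses.
    """
    digest = hashlib.sha256(seed + name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % ngroups

def layout(names, seed, ngroups=64):
    """Return the group chosen for each directory name."""
    return {name: dir_group(name, seed, ngroups) for name in names}
```

With the same seed and the same names, layout() returns the same
mapping every time; change either one and the mapping may change.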

All of this having been said, it may very well be possible to improve
on the anti-fragmentation algorithms while still trying to allocate
block groups closer to the beginning of the disk to take advantage of
the inner-diameter/outer-diameter placement effect.  There's probably
room for some research work here.  But please do be careful before
twiddling too much with the allocator algorithms; they are somewhat
subtle....

						- Ted
