Message-ID: <20131002170237.GB16076@kvack.org>
Date: Wed, 2 Oct 2013 13:02:37 -0400
From: Benjamin LaHaise <bcrl@...ck.org>
To: Theodore Ts'o <tytso@....edu>
Cc: Eric Sandeen <sandeen@...hat.com>, Jan Kara <jack@...e.cz>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: add noorlov parameter to avoid spreading of directory inodes
On Wed, Oct 02, 2013 at 12:23:23PM -0400, Theodore Ts'o wrote:
> Ext3 used an orlov style allocator as well. The main difference
> between ext4 and ext3 is the orlov allocator is now done on a
> per-flexbg basis instead of per-blockgroup basis.
>
> That is, we do the statistics based on a flex-bg basis instead of the
> blockgroup basis. As a result, I suspect Ben would see the inode
> allocation behavior equivalent to ext3 if he creates the file system
> using "mke2fs -t ext4 -G 1" to force the flex_bg size to 1.
>
> Can you let me know what the size of the file system was, and mke2fs
> parameters you were using for ext3 and ext4? I have a feeling that
> inode allocations weren't optimal for your use case even with ext3,
> but because we now spread the inodes based on flex_bg's instead of
> block groups, that's why you saw the performance degredation.

This may have been a bit misleading -- other parts of the system changed
between the version running on ext3 vs ext4. Subdirectories weren't used
as much on ext3 as on ext4, so the effect wasn't nearly as pronounced.
Further investigation showed that the spreading of inodes for directories
was resulting in the files being laid out in different block groups,
which made reading and writing files to disk much less sequential.

The other big change in allocation between ext3 and ext4 is mballoc.
Without fallocate() on the files, the allocator in ext4 was preferentially
aligning files to power-of-2 block numbers. In one of our tests, which
used ~9MB files, this led to gaps of ~1800 blocks between files (even in
the same directory), which degraded transfer rates to/from disk thanks to
the extra seeks. But this aspect of tweaking the allocator
was easily fixed by doing an fallocate() for the size of the file before
writing to it.
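
A minimal sketch of that workaround, for illustration only. The helper
names (gap_to_next_pow2, write_preallocated) are hypothetical and not
taken from our code or from mballoc, and the fallback to
posix_fallocate() is my own addition. Assuming 4 KiB blocks (not stated
above), a 9 MiB file is 2304 blocks and the next power of 2 is 4096,
which leaves 1792 blocks and would be consistent with the ~1800-block
gaps:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Blocks between the end of an nblocks-long file and the next
 * power-of-2 block number. Pure arithmetic for illustration; this is
 * not how mballoc itself computes anything. */
static uint64_t gap_to_next_pow2(uint64_t nblocks)
{
    uint64_t p = 1;

    while (p < nblocks)
        p <<= 1;
    return p - nblocks;
}

/* Hypothetical helper: create `path`, reserve `len` bytes up front with
 * fallocate(), then write `buf`. Returns 0 on success, -1 on error. */
static int write_preallocated(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;
    int fd, saved;

    assert(buf != NULL || len == 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Mode 0 allocates the blocks and extends the file size to len,
     * so the allocator sees the final size before any data arrives. */
    if (len > 0 && fallocate(fd, 0, 0, (off_t)len) != 0) {
        int err = errno;

        /* Assumption: fall back to posix_fallocate() if the file
         * system does not implement fallocate(). */
        if (err == EOPNOTSUPP)
            err = posix_fallocate(fd, 0, (off_t)len);
        if (err != 0) {
            close(fd);
            errno = err;
            return -1;
        }
    }

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return close(fd);
}
```

The point is only the ordering: the size is declared before the first
write(), rather than letting the file grow write by write.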
-ben
--
"Thought is the essence of where you are now."