Message-ID: <20130619121020.GA13018@laptop.brq.redhat.com>
Date: Wed, 19 Jun 2013 14:10:20 +0200
From: Radek Pazdera <rpazdera@...hat.com>
To: Lukáš Czerner <lczerner@...hat.com>
Cc: Dave Chinner <david@...morbit.com>, linux-ext4@...r.kernel.org,
kasparek@....vutbr.cz
Subject: Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index
On Mon, Jun 17, 2013 at 10:58:35AM +0200, Lukáš Czerner wrote:
>On Sun, 16 Jun 2013, Dave Chinner wrote:
>
>> Date: Sun, 16 Jun 2013 10:55:33 +1000
>> From: Dave Chinner <david@...morbit.com>
>> To: Radek Pazdera <rpazdera@...hat.com>
>> Cc: linux-ext4@...r.kernel.org, lczerner@...hat.com, kasparek@....vutbr.cz
>> Subject: Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index
>>
>> On Sat, May 04, 2013 at 11:28:33PM +0200, Radek Pazdera wrote:
>> > Hello everyone,
>> >
>> > I am a university student from Brno /CZE/. I decided to try to optimise
>> > the readdir/stat scenario in ext4 as my final school project. I
>> > posted some test results I got a few months ago [1].
>> >
>> > I tried to implement an additional tree for ext4's directory index
>> > that would be sorted by inode numbers. The tree then would be used
>> > by ext4_readdir() which should lead to substantial increase of
>> > performance of operations that manipulate a whole directory at once.
>> >
>> > The performance increase should be visible especially with large
>> > directories or in case of low memory or cache pressure.
>> >
>> > This patch series is what I've got so far. I must say, I originally
>> > thought it would be *much* simpler :).
>> ....
>> > BENCHMARKS
>> > ==========
>> >
>> > I did some benchmarks and compared the performance with ext4/htree,
>> > XFS, and btrfs up to 5 000 000 files in a single directory. Not
>> > all of them are done though (they run for days).
>>
>> Just a note that for users that have this sort of workload on XFS,
>> it is generally recommended that they increase the directory block
>> size to 8-16k (from the default of 4k). The saddle point where 8-16k
>> directory blocks tend to perform better than 4k directory blocks is
>> around the 2-3 million file point....
>>
>> Further, if you are doing random operations on such directories,
>> then increasing it to the maximum of 64k is recommended. This
>> greatly reduces the IO overhead of directory manipulations by making
>> the trees wider and shallower. i.e. we recommend trading off CPU
>> and memory for lower IO overhead and better layout on disk as it's
>> layout and IO that are the performance limiting factors for large
>> directories. :)

Hi Dave,

Thank you for pointing that out; I was not aware of it. I know that
the 5M tests may be a bit too extreme, but I thought it might be
interesting to see what happens.

>> > Full results are available here:
>> > http://www.stud.fit.vutbr.cz/~xpazde00/soubory/ext4-5M/
>>
>> Can you publish the scripts you used so we can try to reproduce
>> your results?
>
>Hi Dave,
>
>IIRC the tests used to generate the results should be found here:
>
>https://github.com/astro-/dir-index-test
>
>however I am not entirely sure whether the github repository is kept
>up-to-date. Radek, can you confirm?

Lukas is right: these are the scripts I used to get the results above,
and they are up to date.

If you'd like to run the tests, there are some parameters you will
probably need to adjust in the run_tests.sh file. Namely:

  DEVICE       - that's the testing device
  DROP_OFF_DIR - this is a scratch dir for the copy test, which should
                 reside on a separate device
  RESULTS_DIR  - this is where you want your graphs to be stored
  FILESYSTEMS  - ext4, btrfs, jfs or xfs. If you would like to change
                 the parameters of mkfs, you can do it here:
                 https://github.com/astro-/dir-index-test/blob/master/scripts/prepfs.sh
  FSIZES       - the size of each file in the directory (if you provide
                 a list of values, the tests will be run multiple times
                 with different file sizes)
  TEST_CASES   - the readdir-stat and getdents-stat are just isolated
                 directory traversals (they are written in C)
                 https://github.com/astro-/dir-index-test/blob/master/src/readdir-stat.c
                 https://github.com/astro-/dir-index-test/blob/master/src/getdents-stat.c
                 The other tests are here:
                 https://github.com/astro-/dir-index-test/tree/master/tests
  DIR_TYPE     - clean or dirty (you will probably be interested in the
                 "dirty" type of tests). The difference can be seen here
                 (the create_clean_dir and create_dirty_dir functions):
                 https://github.com/astro-/dir-index-test/blob/master/scripts/create_files.py
  DIR_SIZES    - you can put a list of values here

To be able to run the tests properly, you need to have gnuplot installed.
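
For orientation, an isolated readdir-stat traversal boils down to
listing one directory and calling stat on every entry in the order the
file system returns them. The following is only a rough Python sketch
of that idea, not the C code from the repository; the function name
readdir_stat and its return value are made up for illustration.

```python
import os
import tempfile


def readdir_stat(path):
    """List one directory and stat every entry, in readdir order.

    Hypothetical helper for illustration only; returns a tuple
    (number_of_entries, total_size_in_bytes).
    """
    count = 0
    total_size = 0
    # os.scandir yields entries in the order the file system hands them
    # back, skipping "." and "..".
    with os.scandir(path) as entries:
        for entry in entries:
            # One explicit stat call per entry; follow_symlinks=False
            # makes it behave like lstat(2).
            info = os.stat(entry.path, follow_symlinks=False)
            count += 1
            total_size += info.st_size
    return count, total_size


if __name__ == "__main__":
    # Small self-check on a throwaway directory.
    with tempfile.TemporaryDirectory() as scratch:
        for i in range(3):
            with open(os.path.join(scratch, "file%d" % i), "wb") as fh:
                fh.write(b"abcd")
        print(readdir_stat(scratch))
```

With an index sorted by hash, the order of entries returned by the
listing does not follow the inode order on disk, so the per-entry stat
calls are what this kind of traversal is meant to stress.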

If you have any questions or problems, please let me know :).

Cheers,
Radek

>-Lukas
>
>>
>> > I also did some tests on an aged file system (I used the simple 0.8
>> > chance to create, 0.2 to delete a file) where the results of ext4
>> > with itree are much better even than xfs, which gets fragmented:
>> >
>> > http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/cp.png
>> > http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/readdir-stat.png
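
The aging policy quoted above (0.8 chance to create, 0.2 to delete)
can be sketched roughly as follows. This is an illustration only and
not the code from create_files.py; the function name age_directory,
the file naming, the fixed seed, and the rule of always creating when
the directory is empty are assumptions.

```python
import os
import random
import tempfile


def age_directory(path, steps, p_create=0.8, seed=0):
    """Age a directory with random creates and deletes.

    Hypothetical helper: each step creates a new empty file with
    probability p_create, otherwise deletes a random existing one.
    Returns the number of files left in the directory.
    """
    rng = random.Random(seed)  # fixed seed keeps runs repeatable
    live = []                  # names of files that currently exist
    next_id = 0
    for _ in range(steps):
        # Assumption: when nothing is left to delete, create instead.
        if rng.random() < p_create or not live:
            name = "f%08d" % next_id
            next_id += 1
            with open(os.path.join(path, name), "wb"):
                pass
            live.append(name)
        else:
            victim = live.pop(rng.randrange(len(live)))
            os.remove(os.path.join(path, victim))
    return len(live)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        remaining = age_directory(scratch, 1000)
        print(remaining, len(os.listdir(scratch)))
```

The deletes leave holes among the directory entries and inodes, which
is the kind of fragmentation the "dirty" results are meant to show.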
>>
>> This XFS result is of interest to me here - it shouldn't degrade
>> like that, so having the script to be able to reproduce it locally
>> would be helpful to me. Indeed, I posted a simple patch yesterday
>> that significantly improves XFS performance on a similar small file
>> create workload:
>>
>> http://marc.info/?l=linux-fsdevel&m=137126465712701&w=2
>>
>> That writeback plugging change should benefit ext4 as well in these
>> workloads....
>>
>> Cheers,
>>
>> Dave.
>>