Message-ID: <20071025144355.583a8f88@think.oraclecorp.com>
Date: Thu, 25 Oct 2007 14:43:55 -0400
From: Chris Mason <chris.mason@...cle.com>
To: "Jose R. Santos" <jrs@...ibm.com>
Cc: linux-ext4@...r.kernel.org
Subject: Re: compilebench numbers for ext4
On Thu, 25 Oct 2007 10:34:49 -0500
"Jose R. Santos" <jrs@...ibm.com> wrote:
> On Mon, 22 Oct 2007 19:31:04 -0400
> Chris Mason <chris.mason@...cle.com> wrote:
>
> > Hello everyone,
> >
> > I recently posted some performance numbers for Btrfs with different
> > blocksizes, and to help establish a baseline I did comparisons with
> > Ext3.
> >
> > The graphs, numbers and a basic description of compilebench are
> > here:
> >
> > http://oss.oracle.com/~mason/blocksizes/
>
> I've been playing a bit with the workload and I have a couple of
> comments.
>
> 1) I find the averaging of results at the end of the run misleading
> unless you run a high number of directories. A single very good
> result due to page caching effects seems to skew the final results
> output. Have you considered also providing the standard deviation
> of the data points, to show how widely the results are spread?
This is the main reason I keep the output from each run. Stdev would
definitely help as well; I'll put it on the todo list.
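For what it's worth, the post-processing I have in mind is nothing
fancier than the sketch below. It assumes the per-run MB/s figures have
already been pulled out of the compilebench output into a list; the
numbers here are made up:

# Minimal sketch: mean and standard deviation over per-run results.
# Assumes the MB/s figure from each run has already been parsed into a
# list of floats; the parsing side is left out on purpose.
import math

def mean_and_stdev(samples):
    n = len(samples)
    avg = sum(samples) / n
    var = sum((s - avg) ** 2 for s in samples) / n
    return avg, math.sqrt(var)

runs = [18.0, 17.4, 42.9, 17.8]   # hypothetical MB/s numbers, one per run
avg, dev = mean_and_stdev(runs)
print("avg %.2f MB/s, stdev %.2f MB/s" % (avg, dev))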
>
> 2) You mentioned that one of the goals of the benchmark is to measure
> locality during directory aging, but the workload seems too well
> ordered to truly age the filesystem. At least that's what I can gather
> from the output the benchmark spits out. It may be that I'm not
> understanding the relationship between INITIAL_DIRS and RUNS, but the
> workload seems to be localized to operations on a single dir at a
> time. Just wondering if this is truly stressing allocation algorithms
> in a significant or realistic way.
A good question. compilebench has two modes, and the default is better
at aging than the run I graphed on ext4. compilebench isn't trying to
fragment individual files, but it is instead trying to fragment
locality, and lower the overall performance of a directory tree.
In the default run, the patch, clean, and compile operations end up
changing around groups of files in a somewhat random fashion (at least
from the FS point of view). But, it is still a workload where a good
FS should be able to maintain locality and provide consistent results
over time.
The ext4 numbers I sent here are from compilebench --makej, which is a
shorter and less complex run. It has a few simple phases:
* create some number of kernel trees sequentially
* write new files into those trees in random order
* read three of the trees
* delete all the trees
It is a very basic test that can give you a picture of directory
layout, writeback performance and overall locality.
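For reference, a toy outline of those phases looks roughly like the
sketch below. The helper bodies, file counts and sizes are made up for
illustration; the real script drives kernel-tree-shaped datasets and
times each phase:

# Rough, runnable outline of the --makej phases described above.  This
# is a toy stand-in, not compilebench's actual code.
import os, random, shutil

def create_tree(base, idx, nfiles=50):
    tree = os.path.join(base, "tree-%d" % idx)
    os.makedirs(tree)
    for n in range(nfiles):
        open(os.path.join(tree, "file-%d" % n), "w").write("x" * 4096)
    return tree

def write_new_files(tree, nfiles=10):
    for n in range(nfiles):
        open(os.path.join(tree, "new-%d" % n), "w").write("y" * 4096)

def read_tree(tree):
    for name in os.listdir(tree):
        open(os.path.join(tree, name)).read()

def makej_run(base, num_trees=20):
    # phase 1: create the trees sequentially
    trees = [create_tree(base, i) for i in range(num_trees)]
    # phase 2: write new files into the trees in random order
    order = list(trees)
    random.shuffle(order)
    for tree in order:
        write_new_files(tree)
    # phase 3: read a few of the trees back
    for tree in random.sample(trees, min(3, num_trees)):
        read_tree(tree)
    # phase 4: delete everything
    for tree in trees:
        shutil.rmtree(tree)

if __name__ == "__main__":
    makej_run("/tmp/makej-demo")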
>
> If I understand how compilebench works, directories would be allocated
> within one or two block group boundaries, so the data and metadata
> would be in very close proximity. I assume that doing random lookups
> through the entire file set would show some weakness in the ext3
> metadata layout.
Probably.
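If you want to poke at that directly, a cold-cache pass of random stats
over the whole tree is enough to see it. This is a rough sketch, not a
proper benchmark; drop caches first with echo 3 > /proc/sys/vm/drop_caches:

# Rough sketch: cold-cache random lookups across an existing file set,
# to stress metadata layout as described in the quoted paragraph above.
import os, random, sys, time

paths = []
for root, dirs, names in os.walk(sys.argv[1]):
    for name in names:
        paths.append(os.path.join(root, name))

random.shuffle(paths)
start = time.time()
for p in paths:
    os.stat(p)
print("%d random stats in %.2f seconds" % (len(paths), time.time() - start))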
>
> I really want to use seekwatcher to test some of the stuff that I'm
> doing for the flex_bg feature, but it barfs on me on my test machine.
>
> running :sleep 10:
> done running sleep 10
> Device: /dev/sdh
> Total: 0 events (dropped 0), 1368 KiB data
> blktrace done
> Traceback (most recent call last):
>   File "/usr/bin/seekwatcher", line 534, in ?
>     add_range(hist, step, start, size)
>   File "/usr/bin/seekwatcher", line 522, in add_range
>     val = hist[slot]
> IndexError: list index out of range
I don't think you have any events in the trace. Try this instead:
echo 3 > /proc/sys/vm/drop_caches
seekwatcher -t find-trace -d /dev/xxxx -p 'find /usr/local -type f'
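If seekwatcher should fail more gracefully on an empty trace in the
meantime, a bounds check around that hist[slot] lookup would at least
avoid the traceback. This is only a sketch of the idea; I'm guessing at
how the histogram is sized and what the arguments mean, so don't read
it as a patch against the real script:

# Sketch only: a defensive version of the lookup that blows up above.
# Guess: hist is a list of per-slot buckets and slot comes from the
# event offset/time divided by step; with an empty trace the histogram
# ends up shorter than the slot being asked for.
def add_range(hist, step, start, size):
    if not hist or step <= 0:
        return
    slot = int(start / step)
    if slot < 0 or slot >= len(hist):
        return
    hist[slot] += size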
>
> This is running on a PPC64/gentoo combination. Don't know if this
> means anything to you. I have a very basic algorithm to take
> advantage of block group metadata grouping, and I want to be able to
> better visualize how different IO patterns take advantage of or are
> hurt by the feature.
I wanted to benchmark flexbg too, but couldn't quite figure out the
correct patch combination ;)
>
> > To match the ext4 numbers with Btrfs, I'd probably have to turn off
> > data checksumming...
> >
> > But oddly enough I saw very bad ext4 read throughput even when
> > reading a single kernel tree (outside of compilebench). The time
> > to read the tree was almost 2x ext3. Have others seen similar
> > problems?
> >
> > I think the ext4 delete times are so much better than ext3 because
> > this is a single threaded test. delayed allocation is able to get
> > everything into a few extents, and these all end up in the inode.
> > So, the delete phase only needs to seek around in small directories
> > and seek to well grouped inodes. ext3 probably had to seek all
> > over for the direct/indirect blocks.
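An easy way to eyeball that difference, though it's not something
compilebench does itself, is to count extents per file with filefrag
from e2fsprogs. The sketch below just parses filefrag's "N extents
found" output and may need root depending on how the blocks get mapped:

# Illustration only (not part of compilebench): average extents per
# file in a tree, using filefrag from e2fsprogs.  Assumes the usual
# "path: N extents found" output format.
import os, subprocess, sys

def extent_count(path):
    out = subprocess.Popen(["filefrag", path],
                           stdout=subprocess.PIPE).communicate()[0].decode()
    try:
        return int(out.split(":")[-1].split()[0])
    except (ValueError, IndexError):
        return 0

total = files = 0
for root, dirs, names in os.walk(sys.argv[1]):
    for name in names:
        total += extent_count(os.path.join(root, name))
        files += 1
if files:
    print("%d files, %.2f extents per file on average"
          % (files, total / float(files)))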
> >
> > So, tomorrow I'll run a few tests with delalloc and mballoc
> > independently, but if there are other numbers people are interested
> > in, please let me know.
> >
> > (test box was a desktop machine with single sata drive, barriers
> > were not used).
>
> More details please....
>
> 1. CPU info (type, count, speed)
Dual-core 3GHz x86-64
> 2. Memory info (mostly amount)
2GB
> 3. Disk info (partition size, disk rpms, interface, internal cache
> size)
SAMSUNG HD160JJ (SATA II w/NCQ); the FS was on a 40GB LVM volume.
Single spindle.
> 4. Benchmark cmdline parameters.
mkdir ext4
compilebench --makej -D /mnt -d /dev/mapper/xxxx -t ext4/trace -i 20 >&
ext4/out
-chris