Message-ID: <4B37BA76.7050403@hp.com>
Date: Sun, 27 Dec 2009 14:50:14 -0500
From: jim owens <jowens@...com>
To: Christian Kujau <lists@...dbynature.de>
CC: Larry McVoy <lm@...mover.com>, tytso@....edu,
jfs-discussion@...ts.sourceforge.net, linux-nilfs@...r.kernel.org,
xfs@....sgi.com, reiserfs-devel@...r.kernel.org,
Peter Grandi <pg_jf2@....for.sabi.co.UK>,
ext-users <ext3-users@...hat.com>, linux-ext4@...r.kernel.org,
linux-btrfs@...r.kernel.org
Subject: Re: [Jfs-discussion] benchmark results
Christian Kujau wrote:
> On 26.12.09 08:00, jim owens wrote:
>>> I was using "sync" to make sure that the data "should" be on the disks
>> Good, but not good enough for many tests... info sync
> [...]
>> On Linux, sync is only guaranteed to schedule the dirty blocks for
>> writing; it can actually take a short time before all the blocks are
>> finally written.
OK, that was wrong per Ted's explanation:
>
> But for quite some time, under Linux the sync(2) system call will wait
> for the blocks to be flushed out to HBA, although we currently don't
> wait for the blocks to have been committed to the platters (at least
> not for all file systems).
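(As an aside: if a test really needs the data handed to the device before
the timer stops, per-file fsync(2) is the usual tool rather than a bare
sync(2). A minimal sketch, with a hypothetical path and buffer size:

/* Minimal sketch (hypothetical path): make one file durable before
 * stopping the clock, instead of relying on a bare sync(2). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    memset(buf, 'x', sizeof(buf));

    int fd = open("/mnt/test/datafile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        perror("write");
        return 1;
    }
    /* fsync() does not return until the data and metadata have been
     * pushed to the device; with write barriers enabled the filesystem
     * also asks the device to flush its volatile cache. */
    if (fsync(fd) < 0) {
        perror("fsync");
        return 1;
    }
    close(fd);
    return 0;
}

Note that even fsync(2) only gets you as far as the device will honor a
flush, which is exactly where the HBA cache problem below comes in.)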
But Christian Kujau wrote:
> Noted, many times already. That's why I wrote "should be" - but in this
> special scenario (filesystem speed tests) I don't care about file
> integrity: if I pull the plug after "sync" and some data didn't make it
> to the disks, I'll only check whether the test script got all the
> timestamps and move on to the next test. I'm not testing for "filesystem
> integrity after someone pulls the plug" here. And remember, I'm doing
> "sync" for all the filesystems tested, so the comparison still stands.
You did not understand my point. It was not about data integrity;
it was about the validity of the test timings.
And even with sync(2) behaving as Ted describes, *timing* may still
tell you the wrong thing, or fail to tell you something important.
I have a battery-backed HBA cache. Writes are HBA cached, so the timing
only shows "to HBA memory". Thus 1000 pages (4 MB total) scattered across
1000 places on the disk will show (almost) the same completion time as
1000 pages laid out in 20 extents of 50 pages each. Written to bare disk,
the time difference between the two would be an obvious slap upside the head.
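To make the pitfall concrete, here is a rough sketch of such a timing
harness (the path, stride, and sizes are all made up for illustration).
Behind a battery-backed HBA cache the two runs finish in nearly the same
time; on bare disk the scattered run is far slower:

/* Rough sketch (hypothetical path and geometry): time 1000 scattered
 * 4 KiB writes against the same 1000 pages laid out as 20 extents of
 * 50 contiguous pages each. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define PAGE  4096
#define PAGES 1000

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Write PAGES pages: 'runlen' contiguous pages, then jump by 'stride'. */
static double timed_run(int fd, const char *buf, off_t stride, int runlen)
{
    double t0 = now();
    off_t off = 0;

    for (int i = 0; i < PAGES; i++) {
        if (pwrite(fd, buf, PAGE, off) != PAGE) {
            perror("pwrite");
            exit(1);
        }
        off += ((i + 1) % runlen == 0) ? stride : PAGE;
    }
    if (fsync(fd) < 0) {        /* wait until the device accepts it */
        perror("fsync");
        exit(1);
    }
    return now() - t0;
}

int main(void)
{
    static char buf[PAGE];
    memset(buf, 'x', sizeof(buf));

    int fd = open("/mnt/test/bigfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* 8 MiB jumps force seeks between pages/extents on bare disk. */
    printf("1000 single pages: %.3f s\n",
           timed_run(fd, buf, (off_t)8 << 20, 1));
    printf("20 x 50-page runs: %.3f s\n",
           timed_run(fd, buf, (off_t)8 << 20, 50));

    close(fd);
    return 0;
}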
Hardware caches can trick you into thinking a filesystem performs
much better than it really does for some operations, or mislead you
about the relative performance of two filesystems.
And I don't even care about comparing two filesystems; I only care about
timing two versions of code in the single filesystem I am working on,
and forgetting about hardware cache effects has screwed me even there.
So unless you are sure you have no hardware cache effects...
"the comparison still stands" is *false*.
jim