linux-kernel - Re: Performance testing of various barrier reduction patches [was: Re: [RFC v4] ext4: Coordinate fsync requests]

Message-ID: <20101011202020.GF25624@tux1.beaverton.ibm.com>
Date:	Mon, 11 Oct 2010 13:20:20 -0700
From:	"Darrick J. Wong" <djwong@...ibm.com>
To:	Ric Wheeler <rwheeler@...hat.com>
Cc:	Andreas Dilger <adilger.kernel@...ger.ca>,
	"Ted Ts'o" <tytso@....edu>, Mingming Cao <cmm@...ibm.com>,
	linux-ext4 <linux-ext4@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Keith Mannthey <kmannth@...ibm.com>,
	Mingming Cao <mcao@...ibm.com>, Tejun Heo <tj@...nel.org>,
	hch@....de, Josef Bacik <josef@...hat.com>,
	Mike Snitzer <snitzer@...hat.com>
Subject: Re: Performance testing of various barrier reduction patches [was:
	Re: [RFC v4] ext4: Coordinate fsync requests]

On Fri, Oct 08, 2010 at 05:56:12PM -0400, Ric Wheeler wrote:
>  On 10/08/2010 05:26 PM, Darrick J. Wong wrote:
>> On Mon, Sep 27, 2010 at 04:01:11PM -0700, Darrick J. Wong wrote:
>>> Other than those regressions, the jbd2 fsync coordination is about as fast as
>>> sending the flush directly from ext4.  Unfortunately, where there _are_
>>> regressions they seem rather large, which makes this approach (as implemented,
>>> anyway) less attractive.  Perhaps there is a better way to do it?
>> Hmm, not much chatter for two weeks.  Either I've confused everyone with the
>> humongous spreadsheet, or ... something?
>>
>> I've performed some more extensive performance and safety testing with the
>> fsync coordination patch.  The results have been merged into the spreadsheet
>> that I linked to in the last email, though in general the results have not
>> really changed much at all.
>>
>> I see two trends happening here with regards to comparing the use of jbd2 to
>> coordinate the flushes vs. measuring and coordinating flushes directly in ext4.
>> The first is that for loads that most benefit from having any kind of fsync
>> coordination (i.e. storage with slow flushes), the jbd2 approach provides the
>> same or slightly better performance than the direct approach.  However, for
>> storage with fast flushes, the jbd2 approach seems to cause major slowdowns
>> even compared to not changing any code at all.  To me this would suggest that
>> ext4 needs to coordinate the fsyncs directly, even at a higher code maintenance
>> cost, because a huge performance regression isn't good.
>>
>> Other people in my group have been running their own performance comparisons
>> between no-coordination, jbd2-coordination, and direct-coordination, and what
>> I'm hearing is that the direct-coordination mode is slightly faster than jbd2
>> coordination, though either is better than no coordination at all.  Happily, I
>> haven't seen an increase in fsck complaints in my poweroff testing.
>>
>> Given the nearness of the merge window, perhaps we ought to discuss this on
>> Monday's ext4 call?  In the meantime I'll clean up the fsync coordination patch
>> so that it doesn't have so many debugging knobs and whistles.
>>
>> Thanks,
>>
>> --D
>
> Hi Darrick,
>
> We have been busily testing various combinations at Red Hat (we being not 
> me :)), but here is one test that we used a long time back to validate 
> the batching impact.
>
> You need a slow, poky S-ATA drive - the slower it spins, the better.
>
> A single fs_mark run against that drive should drive some modest number 
> of files/sec with 1 thread:
>
>
> [root@...kums /]# fs_mark -s 20480 -n 500 -L 5 -d /test/foo
>
> On my disk, I see:
>
>      5          500        20480         31.8             6213
>
> Now run with 4 threads to give the code a chance to coalesce.
>
> On my box, I see it jump up:
>
>      5         2000        20480        113.0            25092
>
> And at 8 threads it jumps again:
>
>      5         4000        20480        179.0            49480
>
> This work load is very device specific. On a very low latency device 
> (arrays, high performance SSD), the coalescing "wait" time could be 
> slower than just dispatching the command. Ext3/4 work done by Josef a few 
> years back was meant to use high res timers to dynamically adjust that 
> wait to avoid slowing down.
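
For reference, the gain from batching in the fs_mark runs quoted above can be
worked out directly.  The snippet below is only arithmetic on the three
files/sec figures reported there (31.8, 113.0, 179.0); the function name is
made up for illustration and is not part of fs_mark.

```python
# files/sec from the quoted fs_mark runs, keyed by thread count
FILES_PER_SEC = {1: 31.8, 4: 113.0, 8: 179.0}


def batching_gain(results, baseline_threads=1):
    """Return {threads: (speedup vs. baseline, files/sec per thread)}."""
    base = results[baseline_threads]
    return {t: (rate / base, rate / t) for t, rate in sorted(results.items())}


if __name__ == "__main__":
    for threads, (speedup, per_thread) in batching_gain(FILES_PER_SEC).items():
        print(f"{threads} threads: {speedup:.2f}x baseline, "
              f"{per_thread:.1f} files/sec per thread")
```

On these numbers the aggregate rate rises to roughly 3.6x at 4 threads and
5.6x at 8 threads, while the per-thread rate drops from 31.8 to about 28.3
and 22.4 files/sec.  That is consistent with several fsyncs sharing a flush,
though the figures alone do not prove it.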

Yeah, elm3c65 and elm3c75 in that spreadsheet are a new pokey SATA disk and a
really old IDE disk, which ought to represent the low end case.  elm3c44-sas is
a midrange storage server... which doesn't like the patch so much.
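
To make the slow-flush vs. fast-flush trade-off concrete, here is a toy model
of a wait that adapts to measured flush time.  It is a sketch under my own
assumptions, not the jbd2 code or the patch under test: the class name, the
15 ms cap and the smoothing weight are all hypothetical.  The idea is simply
that a caller should not wait for other fsyncs longer than a flush usually
takes on that device.

```python
class AdaptiveBatchWait:
    """Toy model: hold a flush back no longer than a flush usually takes."""

    def __init__(self, max_wait_us=15000.0, weight=0.25):
        self.max_wait_us = max_wait_us  # hypothetical upper bound on the wait
        self.weight = weight            # hypothetical smoothing factor
        self.avg_flush_us = None        # no flush measured yet

    def record_flush(self, duration_us):
        """Fold one measured flush duration into the running average."""
        if self.avg_flush_us is None:
            self.avg_flush_us = float(duration_us)
        else:
            self.avg_flush_us += self.weight * (duration_us - self.avg_flush_us)

    def wait_us(self):
        """How long to wait for other fsync callers to join this flush."""
        if self.avg_flush_us is None:
            return 0.0                  # nothing measured: dispatch at once
        return min(self.avg_flush_us, self.max_wait_us)


if __name__ == "__main__":
    slow_disk = AdaptiveBatchWait()
    slow_disk.record_flush(30000)       # e.g. a 30 ms flush
    fast_array = AdaptiveBatchWait()
    fast_array.record_flush(200)        # e.g. a 200 us flush
    print(slow_disk.wait_us(), fast_array.wait_us())
```

With a model like this, a slow disk gets a long batching window while a
low-latency array gets almost none, which is the behaviour one would want on
something like elm3c44-sas.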

--D