Message-ID: <4C9C8F02.5080005@redhat.com>
Date:	Fri, 24 Sep 2010 07:44:02 -0400
From:	Ric Wheeler <rwheeler@...hat.com>
To:	Andreas Dilger <adilger.kernel@...ger.ca>
CC:	djwong@...ibm.com, "Ted Ts'o" <tytso@....edu>,
	Mingming Cao <cmm@...ibm.com>,
	linux-ext4 <linux-ext4@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Keith Mannthey <kmannth@...ibm.com>,
	Mingming Cao <mcao@...ibm.com>, Tejun Heo <tj@...nel.org>,
	hch@....de
Subject: Re: Performance testing of various barrier reduction patches [was:
 Re: [RFC v4] ext4: Coordinate fsync requests]

  On 09/24/2010 02:24 AM, Andreas Dilger wrote:
> On 2010-09-23, at 17:25, Darrick J. Wong wrote:
>> To try to find an explanation, I started looking for connections between fsync delay values and average flush times.  I noticed that the setups with low (< 8ms) flush times exhibit better performance when fsync coordination is not attempted, and the setups with higher flush times exhibit better performance when fsync coordination happens.  This is also no surprise, as it seems perfectly reasonable that the more time consuming a flush is, the more desirable it is to spend a little time coordinating those flushes across CPUs.
>>
>> I think a reasonable next step would be to alter this patch so that ext4_sync_file always measures the duration of the flushes that it issues, but only enables the coordination steps if it detects the flushes taking more than about 8ms.  One thing I don't know for sure is whether the 8ms is a result of a 2-jiffy delay (with HZ currently set to 250, two jiffies is 8ms) or whether 8ms is a hardware property.
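
To make that concrete, here is a minimal sketch of the proposed
self-tuning check; everything in it (fsync_flush_sketch, coordinated_flush,
direct_flush, avg_flush_ns, and the COORD_THRESHOLD_NS cutoff) is a
hypothetical name, not existing ext4 code. The idea is simply: always time
the flush, but only pay for cross-CPU coordination once flushes are slow
enough that coordinating them can win back more than it costs.

	#include <linux/ktime.h>
	#include <linux/time.h>

	#define COORD_THRESHOLD_NS	(8 * NSEC_PER_MSEC)	/* assumed ~8ms cutoff */

	static u64 avg_flush_ns;	/* hypothetical running average of flush cost */

	static int coordinated_flush(struct inode *inode);	/* hypothetical */
	static int direct_flush(struct inode *inode);		/* hypothetical */

	static int fsync_flush_sketch(struct inode *inode)
	{
		ktime_t start = ktime_get();
		int ret;

		if (avg_flush_ns > COORD_THRESHOLD_NS)
			ret = coordinated_flush(inode);	/* batch across CPUs */
		else
			ret = direct_flush(inode);	/* issue the flush now */

		/* Fold this flush into the average, weighted 3:1 toward history. */
		avg_flush_ns = (3 * avg_flush_ns +
				ktime_to_ns(ktime_sub(ktime_get(), start))) / 4;
		return ret;
	}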
> Note that the JBD/JBD2 code will already dynamically adjust the journal flush interval based on the delay seen when writing the journal commit block.  This was done to allow aggregating sync journal operations for slow devices, and allowing fast (no delay) sync on fast devices.  See jbd2_journal_stop() for details.
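
For reference, the measurement half of that aggregation looks roughly like
this, paraphrased from jbd2_journal_commit_transaction() (locking elided):
once the commit block has been written, the observed commit time is folded
into a running average, weighted 3:1 toward history so that a single slow
commit does not swing the estimate too hard.

	commit_time = ktime_to_ns(ktime_sub(ktime_get(), start_time));

	/* Weight history 3:1 so we don't overreact to one outlier. */
	if (likely(journal->j_average_commit_time))
		journal->j_average_commit_time = (commit_time +
				journal->j_average_commit_time * 3) / 4;
	else
		journal->j_average_commit_time = commit_time;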
>
> I think the best approach is to just depend on the journal to do this sync aggregation, if at all possible, otherwise use the same mechanism in ext3/4 for fsync operations that do not involve the journal (e.g. nojournal mode, data sync in writeback mode, etc).
>
> Using any fixed threshold is the wrong approach, IMHO.
>
> Cheers, Andreas

I agree - we started on that dynamic batching when we noticed that single
threaded writes to an array ran at something like 720 files/sec (using
fs_mark) while two-threaded writes dropped to 230 files/sec. That drop was
directly attributable to the fixed (1 jiffy) wait we used to do.

Josef Bacik worked on the dynamic batching so that we would not wait
(sometimes much too long!) trying to batch other fsyncs/flushes into a
transaction when it was faster just to dispatch them.
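
The decision half lives in jbd2_journal_stop() and looks roughly like this
(paraphrased, locking and error handling elided): a synchronous handle
sleeps to let other threads join the transaction only when the expected
commit would take longer than the handle has already been running, i.e.
only when batching is expected to be cheaper than dispatching immediately.

	if (handle->h_sync && journal->j_average_commit_time) {
		u64 trans_time, commit_time;

		/* How long has this handle been open already? */
		trans_time = ktime_to_ns(ktime_sub(ktime_get(),
						   transaction->t_start_time));

		/* Expected commit cost, clamped to the batch tunables (us). */
		commit_time = journal->j_average_commit_time;
		commit_time = max_t(u64, commit_time,
				    1000 * journal->j_min_batch_time);
		commit_time = min_t(u64, commit_time,
				    1000 * journal->j_max_batch_time);

		/* Sleep only if waiting is cheaper than committing now. */
		if (trans_time < commit_time) {
			ktime_t expires = ktime_add_ns(ktime_get(),
						       commit_time);
			set_current_state(TASK_UNINTERRUPTIBLE);
			schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
		}
	}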

A related worry I have is that we have other places in the kernel that
currently wait far too long for our current classes of devices....

Thanks,

Ric

