linux-kernel - Re: [PATCH v6 0/4] ext4: Coordinate data-only flush requests sent by fsync


Message-ID: <20101130001942.GF18195@tux1.beaverton.ibm.com>
Date:	Mon, 29 Nov 2010 16:19:42 -0800
From:	"Darrick J. Wong" <djwong@...ibm.com>
To:	Ric Wheeler <rwheeler@...hat.com>
Cc:	Jens Axboe <axboe@...nel.dk>, "Theodore Ts'o" <tytso@....edu>,
	Neil Brown <neilb@...e.de>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Alasdair G Kergon <agk@...hat.com>, Jan Kara <jack@...e.cz>,
	Mike Snitzer <snitzer@...hat.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-raid@...r.kernel.org, Keith Mannthey <kmannth@...ibm.com>,
	dm-devel@...hat.com, Mingming Cao <cmm@...ibm.com>,
	Tejun Heo <tj@...nel.org>, linux-ext4@...r.kernel.org,
	Christoph Hellwig <hch@....de>, Josef Bacik <josef@...hat.com>
Subject: Re: [PATCH v6 0/4] ext4: Coordinate data-only flush requests sent
	by fsync

On Mon, Nov 29, 2010 at 06:48:25PM -0500, Ric Wheeler wrote:
> On 11/29/2010 05:05 PM, Darrick J. Wong wrote:
>> On certain types of hardware, issuing a write cache flush takes a considerable
>> amount of time.  Typically, these are simple storage systems with write cache
<snip>
>> lowered performance considerably, especially in the case where directio was in
>> use.  Therefore, this patch adds the coordination code directly to ext4.
>
> Hi Darrick,
>
> Just curious why we would need to have batching in both places? Doesn't 
> your patch set make the jbd2 transaction batching redundant?

The code path that I'm changing is only executed when ext4_sync_file determines
that the flush can't go through the journal, i.e. when the preceding data
writes haven't resulted in any metadata updates, or when the transaction that
went with those writes has already been committed.
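
As a rough illustration of the two cases above, here is a user-space model in
Python. It is only a sketch, not the ext4 code: the function name, its
parameters, and the use of plain integer transaction IDs (with no wraparound
handling) are all assumptions made for this example.

```python
from typing import Optional


def needs_standalone_flush(pending_tid: Optional[int],
                           last_committed_tid: int) -> bool:
    """Return True when fsync cannot rely on a journal commit for the flush.

    pending_tid        -- hypothetical: ID of the transaction that holds the
                          metadata updates from the preceding writes, or None
                          if those writes changed no metadata.
    last_committed_tid -- hypothetical: ID of the newest committed transaction.
    """
    # Case 1: the writes touched no metadata, so no transaction covers them.
    if pending_tid is None:
        return True
    # Case 2: the covering transaction has already been committed, so there
    # is no upcoming commit that would issue the flush.
    # (Simplification: plain comparison, ignoring ID wraparound.)
    return pending_tid <= last_committed_tid
```

For example, needs_standalone_flush(None, 10) and needs_standalone_flush(9, 10)
are both True, while needs_standalone_flush(12, 10) is False because
transaction 12 has yet to commit.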

> I noticed that the patches have a default delay and a mount option to 
> override that default. The jbd2 code today tries to measure the average 
> time needed in a transaction and automatically tune itself. Can't we do 
> something similar with your patch set? (I hate to see yet another mount 
> option added!)

The mount option no longer sets the delay time, as it did in previous patches.
In the (unreleased) v5 patch, the code automatically tuned the delay based on
the average flush time.  However, on our arrays with battery-backed write cache
we observed very low flush times (< 2ms) and about a 6% regression, so in v6
the auto-tune code skips the coordination whenever the average flush time falls
below that threshold, as it does on those arrays.

Therefore, the new mount option exists to override the default threshold.
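
To make the threshold behaviour concrete, here is a small Python model. It is a
sketch under stated assumptions rather than the patch itself: the class and
method names are hypothetical, the exponential moving average is a stand-in
(the averaging method is not described here), the 2 ms default is taken from
the figure quoted above, and declining to coordinate before any flush has been
measured is a choice made for this example.

```python
class FlushTuner:
    """Track the average flush time and decide whether to coordinate."""

    def __init__(self, threshold_ns: int = 2_000_000, weight: float = 0.25):
        # threshold_ns models the value the mount option would override.
        self.threshold_ns = threshold_ns
        self.weight = weight      # assumed smoothing factor for the average
        self.avg_ns = None        # no flush measured yet

    def record(self, duration_ns: int) -> None:
        """Fold one measured flush duration into the running average."""
        if self.avg_ns is None:
            self.avg_ns = float(duration_ns)
        else:
            self.avg_ns += self.weight * (duration_ns - self.avg_ns)

    def should_coordinate(self) -> bool:
        """Coordinate only when flushes are slow enough to be worth it."""
        if self.avg_ns is None:
            return False          # assumption: no data means no coordination
        return self.avg_ns >= self.threshold_ns


fast = FlushTuner()               # e.g. battery-backed write cache
for _ in range(3):
    fast.record(500_000)          # 0.5 ms per flush

slow = FlushTuner()               # e.g. a device with slow cache flushes
slow.record(20_000_000)           # 20 ms per flush
```

In this model fast.should_coordinate() is False and slow.should_coordinate() is
True; constructing FlushTuner(threshold_ns=0) forces coordination once any
flush has been recorded, which is the kind of override the mount option gives.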

--D