Date:	Wed, 30 Jun 2010 13:05:16 -0600
From:	Andreas Dilger <adilger@...ger.ca>
To:	Ric Wheeler <rwheeler@...hat.com>
Cc:	tytso@....edu, Christoph Hellwig <hch@...radead.org>,
	Mingming Cao <cmm@...ibm.com>, djwong@...ibm.com,
	linux-ext4 <linux-ext4@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Keith Mannthey <kmannth@...ibm.com>,
	Mingming Cao <mcao@...ibm.com>
Subject: Re: [RFC] ext4: Don't send extra barrier during fsync if there are no dirty pages.

On 2010-06-30, at 07:54, Ric Wheeler wrote:
> On 06/30/2010 09:44 AM, tytso@....edu wrote:
>> We track whether or not there is any metadata updates associated with
>> the inode already; if it does, we force a journal commit, and this
>> implies a barrier operation.
>> 
>> The case we're talking about here is one where either (a) there is no
>> journal, or (b) there have been no metadata updates (I'm simplifying a
>> little here; in fact we track whether there have been fdatasync()- vs
>> fsync()- worthy metadata updates), and so there hasn't been a journal
>> commit to do the cache flush.
>> 
>> In this case, we want to track when is the last time an fsync() has
>> been issued, versus when was the last time data blocks for a
>> particular inode have been pushed out to disk.
> 
> I think that the state that we want to track is the last time the write cache on the target device has been flushed. If the last fsync() did do a full barrier, that would be equivalent :-)

We had a similar problem in Lustre, where we want to ensure the integrity of some data on disk, but don't want to force an extra journal commit/barrier if one has already happened between the time the write was submitted and the time we need the data to be on disk.

We fixed this in a similar manner, but it is optimized somewhat.  In your case there is a flag on the inode in question, but you should also register a journal commit callback after the IO has been submitted that clears the flag when the journal commits (which also implies a barrier).  This avoids a gratuitous barrier if fsync() is called on this (or any other similarly marked) inode after the journal has already issued the barrier.
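
To make the flag-plus-callback idea concrete, here is a rough user-space model of it.  This is only a sketch, not the Lustre or ext4/jbd2 code: every name in it (model_inode, model_journal, model_write_submitted() and so on) is made up for illustration, the "barrier" is just a counter, and it assumes the data IO has completed by the time the journal commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: none of these names are kernel or Lustre APIs. */

struct model_inode {
    bool needs_flush;            /* data submitted, no barrier seen since */
    bool cb_registered;          /* already on the journal's callback list */
    struct model_inode *next_cb; /* next inode waiting for the commit */
};

struct model_journal {
    struct model_inode *callbacks; /* inodes to clear at the next commit */
    unsigned barriers_issued;      /* stands in for real cache flushes */
};

/* Data IO was submitted: set the flag and register for the next commit. */
void model_write_submitted(struct model_journal *j, struct model_inode *ino)
{
    assert(j != NULL && ino != NULL);
    ino->needs_flush = true;
    if (!ino->cb_registered) {
        ino->next_cb = j->callbacks;
        j->callbacks = ino;
        ino->cb_registered = true;
    }
}

/* Journal commit: one barrier, then the callbacks clear every marked inode. */
void model_journal_commit(struct model_journal *j)
{
    assert(j != NULL);
    j->barriers_issued++;
    while (j->callbacks != NULL) {
        struct model_inode *ino = j->callbacks;
        j->callbacks = ino->next_cb;
        ino->next_cb = NULL;
        ino->cb_registered = false;
        ino->needs_flush = false;
    }
}

/* fsync: issue a barrier only if the flag is still set.
 * Returns true if this call had to issue its own barrier. */
bool model_fsync(struct model_journal *j, struct model_inode *ino)
{
    assert(j != NULL && ino != NULL);
    if (!ino->needs_flush)
        return false;            /* a commit already flushed the cache */
    j->barriers_issued++;
    ino->needs_flush = false;
    return true;
}
```

In this model, two inodes written before a background commit cost one barrier in total, and a later fsync() on either of them returns without issuing another.  Only a write that has not yet been covered by a commit makes fsync() pay for its own barrier.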

The best part is that this gives "POSIXly correct" semantics for applications that are issuing the f{,data}sync() on the modified files, without penalizing them again if the journal happened to do this already in the background in aggregate.

Cheers, Andreas




