Message-ID: <20090827011942.GA10541@lst.de>
Date: Thu, 27 Aug 2009 03:19:43 +0200
From: Christoph Hellwig <hch@....de>
To: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: chris.mason@...cle.com, jack@...e.cz, tytso@....edu,
	adilger@....com, swhiteho@...hat.com,
	konishi.ryusuke@....ntt.co.jp, mfasheh@...e.com,
	joel.becker@...cle.com
Subject: Re: [PATCH] notes on volatile write caches vs fdatasync
Not actually a patch, sorry ;-)
On Thu, Aug 27, 2009 at 03:16:24AM +0200, Christoph Hellwig wrote:
> There are two related issues when dealing with volatile write caches.
> The popular and beaten-to-death one is write barriers, which guarantee
> write ordering and stable storage for log writes. For this post
> I naively assume this works perfectly for all filesystems supporting it.
>
> The second issue is plain cache flushes. Yes, they happen to be the
> base for the barrier implementation on all common disks in Linux, but
> there are cases where we need to issue them even without a log barrier.
>
> Think about a plain write into a file that is already fully allocated,
> or the O_DIRECT version of the same. If we do an fdatasync after these,
> we do expect the write to really be on disk, not just in the disk
> cache, right? The same is true for O_SYNC, but I ignore it for this
> write-up: with Jan's patch series O_SYNC writes will be implemented
> by a range-fdatasync after the actual write, so after that this
> discussion covers it, too.
>
> It appears the following Linux filesystems implement barrier support:
>
> - btrfs
> - ext3
> - ext4
> - gfs2
> - nilfs2
> - ocfs2
> - reiserfs
> - xfs
>
> Interestingly, of those only ext4, reiserfs and xfs contain direct
> calls to blkdev_issue_flush. Unless a filesystem really creates
> a transaction for every write and forces it out on fdatasync, it seems
> that all the others have no chance of guaranteeing a cache
> flush on fdatasync.
>
> I have tested btrfs, ext3, ext4, reiserfs, and xfs with a simple test
> program that just does a buffered write into a file, and then calls
> fdatasync. All of the above filesystems issue a barrier request
> when the file blocks aren't allocated yet (for ext3 and reiserfs
> only when barriers are explicitly enabled, of course).
>
> That's not the case anymore when all blocks are already allocated.
> As expected from the above grep results, reiserfs and xfs still issue a
> barrier in that case. Btrfs also performs a cache flush in every
> case, which at first seems unexpected given the lack of any
> blkdev_issue_flush call, but since btrfs is a COW filesystem
> it actually has to allocate blocks even for an overwrite.
> Ext3 expectedly does not issue a cache flush in that case, but ext4
> unexpectedly does not issue a cache flush either. The reason is that
> it only issues the cache flush if the inode was dirty, and
> not at all otherwise.
---end quoted text---
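
For anyone who wants to reproduce the two cases, here is a minimal sketch
of that kind of test. It is not the program used for the results quoted
above. The function name write_and_fdatasync(), the 4096-byte buffer and
the count of 16 writes are made up for illustration. With preallocate set
to 0 the blocks are allocated by the write under test; with preallocate
set to 1 the file is filled and fsync()ed first, so the second pass is a
pure overwrite of already allocated blocks.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE  4096	/* hypothetical write size */
#define NR_WRITES 16	/* hypothetical number of writes */

/* Write 'count' buffers of BUF_SIZE bytes, starting at offset 0. */
static int write_blocks(int fd, int count, unsigned char fill)
{
	unsigned char buf[BUF_SIZE];
	int i;

	memset(buf, fill, sizeof(buf));
	for (i = 0; i < count; i++) {
		ssize_t n = pwrite(fd, buf, sizeof(buf), (off_t)i * BUF_SIZE);

		if (n != (ssize_t)sizeof(buf))
			return -1;
	}
	return 0;
}

/*
 * Buffered write followed by fdatasync.
 * preallocate == 0: blocks get allocated by the write under test.
 * preallocate != 0: the file is filled and fsync()ed first, so the
 *                   write under test only overwrites existing blocks.
 * Returns 0 on success, -1 on any failure.
 */
int write_and_fdatasync(const char *path, int preallocate)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	if (preallocate) {
		if (write_blocks(fd, NR_WRITES, 0x00) < 0 || fsync(fd) < 0)
			goto fail;
	}

	/* The write under test, then the fdatasync we care about. */
	if (write_blocks(fd, NR_WRITES, 0xab) < 0)
		goto fail;
	if (fdatasync(fd) < 0)
		goto fail;

	return close(fd);
fail:
	close(fd);
	return -1;
}
```

A return value of 0 only says the syscalls succeeded. Whether a barrier
or cache flush actually reached the device has to be observed separately
at the block layer, for example with blktrace.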