Message-ID: <1321442458.2713.34.camel@menhir>
Date:	Wed, 16 Nov 2011 11:20:58 +0000
From:	Steven Whitehouse <swhiteho@...hat.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Christoph Hellwig <hch@...radead.org>, linux-btrfs@...r.kernel.org,
	linux-ext4@...r.kernel.org, mfasheh@...e.com, jlbec@...lplan.org,
	cluster-devel@...hat.com
Subject: Re: [Cluster-devel] fallocate vs O_(D)SYNC

Hi,

On Wed, 2011-11-16 at 11:54 +0100, Jan Kara wrote:
> Hello,
> 
> On Wed 16-11-11 09:43:08, Steven Whitehouse wrote:
> > On Wed, 2011-11-16 at 03:42 -0500, Christoph Hellwig wrote:
> > > It seems all filesystems but XFS ignore O_SYNC for fallocate, and never
> > > make sure the size update transaction made it to disk.
> > > 
> > > Given that a fallocate without FALLOC_FL_KEEP_SIZE very much is a data
> > > operation (it adds new blocks that return zeroes) that seems like a
> > > fairly nasty surprise for O_SYNC users.
> > 
> > In GFS2 we zero out the data blocks as we go (since our metadata doesn't
> > allow us to mark blocks as zeroed at alloc time) and also because we are
> > mostly interested in being able to do FALLOC_FL_KEEP_SIZE which we use
> > on our rindex system file in order to ensure that there is always enough
> > space to expand a filesystem.
> > 
> > So there is no danger of having non-zeroed blocks appearing later, as
> > that is done before the metadata change.
> > 
> > Our fallocate_chunk() function calls mark_inode_dirty(inode) on each
> > call, so that fsync should pick that up and ensure that the metadata has
> > been written back. So we should thus have both data and metadata stable
> > on disk.
> > 
> > Do you have some evidence that this is not happening?
>   Yeah, only that nobody calls that fsync() automatically if the fd is
> O_SYNC if I'm right. But maybe calling fdatasync() on the range which was
> fallocated from sys_fallocate() if the fd is O_SYNC would do the trick for
> most filesystems? That would match how we treat O_SYNC for other operations
> as well. I'm just not sure whether XFS wouldn't take unnecessarily big hit
> with this.
> 
> 								Honza

Ah, I see now. Sorry, I missed the original point. So that would just be
a VFS addition to check the O_(D)SYNC flag, as you suggest. I have no
objections to that; it makes sense to me.

Steve.
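
Until something like Honza's suggestion exists in the VFS, an application can approximate the same behaviour from userspace. The following is a minimal sketch, assuming Linux with glibc; fallocate_honoring_sync() is a hypothetical helper name, not a kernel or libc interface. It syncs the whole file rather than only the fallocated range, and picking fsync() for O_SYNC and fdatasync() for O_DSYNC is an assumption about the intended semantics, not something settled in this thread.

```c
#define _GNU_SOURCE             /* exposes fallocate() in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: run fallocate(), then flush explicitly if the
 * descriptor was opened with O_SYNC or O_DSYNC, since most filesystems
 * discussed above do not do this on their own.
 * Returns 0 on success, -1 with errno set on failure.
 */
int fallocate_honoring_sync(int fd, int mode, off_t offset, off_t len)
{
    int flags;

    if (fallocate(fd, mode, offset, len) != 0)
        return -1;

    /* Read back the file status flags the descriptor was opened with. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if ((flags & O_SYNC) == O_SYNC)
        return fsync(fd);       /* O_SYNC: data and all metadata */
    if (flags & O_DSYNC)
        return fdatasync(fd);   /* O_DSYNC: data plus metadata needed to read it back */

    return 0;                   /* no sync flag set, nothing extra to do */
}
```

A caller would use it exactly where it previously called fallocate(fd, mode, offset, len). Descriptors opened without either flag take the same path as before, so the extra flush is only paid by O_(D)SYNC users.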

