Date:	Tue, 30 Nov 2010 11:22:04 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Christoph Hellwig <hch@...radead.org>
Cc:	npiggin@...nel.dk, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 3/7] fs: introduce inode writeback helpers

On Mon, Nov 29, 2010 at 10:13:27AM -0500, Christoph Hellwig wrote:
> On Wed, Nov 24, 2010 at 01:06:13AM +1100, npiggin@...nel.dk wrote:
> > Inode dirty state cannot be safely tested without participating properly
> > in the inode writeback protocol. Some filesystems need to check this state,
> > so break out the code into helpers and make them available.
> > 
> > This could also be used to reduce strange interactions between background
> > writeback and fsync. Currently if we fsync a single page in a file, the
> > entire file gets requeued to the back of the background IO list, even if
> > it is due for writeout and has a large number of pages. That's left for
> > a later time.
> 
> Generally looks fine, but as Dave already mentioned I'd rather keep
> i_state manipulation outside the filesystems.  This could be done with

I don't see a big problem with it. They already loaded it previously, in a
way which required inode_lock (and which was buggy in part because it
didn't take that lock).
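The sort of open-coded check I mean (just a sketch, not lifted from any
particular filesystem):

	/* sketch only: myfs_* is made up */
	static int myfs_inode_clean(struct inode *inode, int datasync)
	{
		unsigned mask = I_DIRTY_DATASYNC;

		if (!datasync)
			mask |= I_DIRTY_SYNC;

		/*
		 * i_state is read without inode_lock here, so this can
		 * race with writeback clearing or re-setting the dirty
		 * bits under us.
		 */
		return !(inode->i_state & mask);
	}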


> two wrappers like the following, which should also keep the churn
> inside fsync implementations down:
> 
> int fsync_begin(struct inode *inode, int datasync)
> {
> 	int ret = 0;
> 	unsigned mask = I_DIRTY_DATASYNC;
> 
> 	if (!datasync)
> 		mask |= I_DIRTY_SYNC;
> 
> 	spin_lock(&inode_lock);
> 	if (!inode_writeback_begin(inode, 1))
> 		goto out;
> 	if (!(inode->i_state & mask))
> 		goto out;
> 
> 	inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_DATASYNC);
> 	ret = 1;
> out:
> 	spin_unlock(&inode_lock);
> 	return ret;
> }
> 
> static void fsync_end(struct inode *inode, int fail)
> {
> 	spin_lock(&inode_lock);
> 	if (fail)
> 		inode->i_state |= I_DIRTY_SYNC | I_DIRTY_DATASYNC;
> 	inode_writeback_end(inode);
> 	spin_unlock(&inode_lock);
> }

I prefer not to do that because it doesn't give any control over
setting or clearing the state flags (which the filesystem might want to
do more intelligently, making this function unusable for it), and it
just restricts how filesystems use inode_writeback_begin and inode_lock.

Basically, if you are doing anything slightly smart, you can use
inode_writeback_begin to exclude concurrent writeout, and while
inode_lock is held you can also prevent new changes to the dirty bits
and thus keep the generic inode dirty bits in sync with your
filesystem's private state.
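E.g. something along these lines (just a sketch; myfs_*, MYFS_I() and
->dirty_flags are made up to stand in for filesystem private state):

	static int myfs_sync_inode(struct inode *inode, int datasync)
	{
		struct myfs_inode *mi = MYFS_I(inode);	/* made up */
		unsigned dirty, private_dirty;
		int err;

		spin_lock(&inode_lock);
		if (!inode_writeback_begin(inode, 1)) {
			spin_unlock(&inode_lock);
			return 0;
		}
		/*
		 * Writeback is excluded and inode_lock stops new dirtying,
		 * so the generic dirty bits and the private flags can be
		 * sampled and cleared together.
		 */
		dirty = inode->i_state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC);
		inode->i_state &= ~(I_DIRTY_SYNC | I_DIRTY_DATASYNC);
		private_dirty = mi->dirty_flags;
		mi->dirty_flags = 0;
		spin_unlock(&inode_lock);

		err = myfs_write_inode_state(inode, dirty, private_dirty,
					     datasync);	/* made up */

		spin_lock(&inode_lock);
		if (err) {
			/* put the bits back so writeback retries the inode */
			inode->i_state |= dirty;
			mi->dirty_flags |= private_dirty;
		}
		inode_writeback_end(inode);
		spin_unlock(&inode_lock);
		return err;
	}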

In short, I don't see anything wrong with exporting
inode_writeback_begin and allowing i_state manipulation by filesystems
that want to do interesting things. And the wrappers AFAICS don't add
that much -- it's not very long or difficult code.


> note that this one marks the inode fully dirty in case of a failure,
> which is a bit of overkill but keeps the interface simpler.  Given that
> failure in fsync is catastrophic anyway (filesystem corruption, etc.)
> that seems fine to me.
> 
> Alternatively we could add an fsync_helper that takes a function
> pointer with the ->write_inode signature and runs the above code
> before and after it.  generic_file_fsync would pass the real
> ->write_inode while other filesystems could pass specific routines.
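For concreteness, such a helper would presumably look something like
this, reusing the fsync_begin/fsync_end sketched above (the
writeback_control setup is guessed):

	int fsync_helper(struct inode *inode, int datasync,
			 int (*write_inode)(struct inode *,
					    struct writeback_control *))
	{
		struct writeback_control wbc = {
			.sync_mode = WB_SYNC_ALL,	/* guessed */
		};
		int err;

		/* nothing dirty, or writeback could not be started */
		if (!fsync_begin(inode, datasync))
			return 0;
		err = write_inode(inode, &wbc);
		fsync_end(inode, err);
		return err;
	}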

