Date:	Tue, 8 Sep 2009 00:14:54 +0200
From:	Jan Kara <jack@...e.cz>
To:	Chris Mason <chris.mason@...cle.com>
Cc:	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	tytso@....edu, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH RFC] Add locking to ext3_do_update_inode

On Fri 04-09-09 16:06:13, Chris Mason wrote:
> Hello everyone,
> 
> I've been struggling with this off and on while I've been testing the
> data=guarded work.  The symptom is corrupted orphan lists and inodes
> with the wrong i_size stored on disk.  I was convinced the
> data=guarded code was just missing a call to ext3_mark_inode_dirty, but
> tracing showed the i_disksize I was sending to ext3_mark_inode_dirty
> wasn't actually making it to the drive.
> 
> ext3_mark_inode_dirty can be called without locks held (atime updates
> and a few others), so the data=guarded code uses locks while updating
> the in-memory inode, and then calls ext3_mark_inode_dirty
> without any locks held.
> 
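> In outline, the data=guarded side looks like this (a sketch; the
> lock name here is hypothetical, not the actual data=guarded code):
> 
> 	spin_lock(&ei->i_guarded_lock);	/* hypothetical lock */
> 	ei->i_disksize = new_size;	/* in-memory update is protected */
> 	spin_unlock(&ei->i_guarded_lock);
> 
> 	/* but the copy into the buffer head runs with no lock held */
> 	ext3_mark_inode_dirty(handle, inode);
> 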
> But, ext3_mark_inode_dirty has no internal locking to make sure that
> only one CPU is updating the buffer head at a time.  Generally this
> works out ok because everyone who changes the inode then calls
> ext3_mark_inode_dirty themselves.  Even though it races, eventually
> someone updates the buffer heads and things move on.
> 
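> To make the race concrete, here is a simplified version of the copy
> in ext3_do_update_inode (abridged, not the exact code):
> 
> 	struct ext3_inode *raw_inode = ext3_raw_inode(iloc);
> 
> 	/* two CPUs can run these stores at the same time against the
> 	 * same bh, interleaving old and new values field by field */
> 	raw_inode->i_mode = cpu_to_le16(inode->i_mode);
> 	raw_inode->i_size = cpu_to_le32(ei->i_disksize);
> 	...
> 	ext3_journal_dirty_metadata(handle, bh);
> 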
> But there is still a risk of the wrong values getting in, and the
> data=guarded code seems to hit the race very often.
> 
> Since everyone who changes the inode also logs it, it should be
> possible to fix this with some memory barriers.  I'll leave that as an
> exercise for the reader and lock the buffer head instead.
>
> It is probably a good idea to have a separate patch series for lockless
> bit flipping on the ext3 i_state field.  ext3_do_update_inode clears
> EXT3_STATE_NEW with a plain &= and no locks held.
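> 
> The problematic clear is a plain read-modify-write:
> 
> 	ei->i_state &= ~EXT3_STATE_NEW;
> 
> Two CPUs flipping different i_state bits at the same time can lose
> one of the updates.  An atomic version would go through the bitops
> API, roughly like this (a sketch: it assumes i_state becomes an
> unsigned long and the flags become bit numbers; EXT3_STATE_NEW_BIT
> is hypothetical):
> 
> 	clear_bit(EXT3_STATE_NEW_BIT, &ei->i_state);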
> 
> Signed-off-by: Chris Mason <chris.mason@...cle.com>
  The patch looks good. I've added it to my tree...

								Honza

> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index 00f5dc1..6a0a056 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -3466,6 +3479,10 @@ static int ext3_do_update_inode(handle_t *handle,
>  	struct buffer_head *bh = iloc->bh;
>  	int err = 0, rc, block;
>  
> +again:
> +	/* we can't allow multiple procs in here at once, it's a bit racy */
> +	lock_buffer(bh);
> +
>  	/* For fields not tracked in the in-memory inode,
>  	 * initialise them to zero for new inodes. */
>  	if (ei->i_state & EXT3_STATE_NEW)
> @@ -3525,16 +3542,20 @@ static int ext3_do_update_inode(handle_t *handle,
>  			       /* If this is the first large file
>  				* created, add a flag to the superblock.
>  				*/
> +				unlock_buffer(bh);
>  				err = ext3_journal_get_write_access(handle,
>  						EXT3_SB(sb)->s_sbh);
>  				if (err)
>  					goto out_brelse;
> +
>  				ext3_update_dynamic_rev(sb);
>  				EXT3_SET_RO_COMPAT_FEATURE(sb,
>  					EXT3_FEATURE_RO_COMPAT_LARGE_FILE);
>  				handle->h_sync = 1;
>  				err = ext3_journal_dirty_metadata(handle,
>  						EXT3_SB(sb)->s_sbh);
> +				/* get our lock and start over */
> +				goto again;
>  			}
>  		}
>  	}
> @@ -3557,6 +3578,7 @@ static int ext3_do_update_inode(handle_t *handle,
>  		raw_inode->i_extra_isize = cpu_to_le16(ei->i_extra_isize);
>  
>  	BUFFER_TRACE(bh, "call ext3_journal_dirty_metadata");
> +	unlock_buffer(bh);
>  	rc = ext3_journal_dirty_metadata(handle, bh);
>  	if (!err)
>  		err = rc;
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR