lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080205163404.GL25464@duck.suse.cz>
Date:	Tue, 5 Feb 2008 17:34:04 +0100
From:	Jan Kara <jack@...e.cz>
To:	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc:	Josef Bacik <jbacik@...hat.com>, Theodore Tso <tytso@....edu>,
	Andreas Dilger <adilger@....com>,
	Mingming Cao <cmm@...ibm.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: jbd2_handle and i_data_sem circular locking dependency detected

On Tue 05-02-08 21:57:03, Aneesh Kumar K.V wrote:
> On Tue, Feb 05, 2008 at 02:42:28PM +0100, Jan Kara wrote:
> > On Tue 05-02-08 17:53:42, Aneesh Kumar K.V wrote:
> > > 
> > > How about the patch below. I did the below testing
> > > a) migrate a file
> > > b) run fs_inode fsstres fsx_linux.
> > > 
> > > The intention was to find out whether the new locking is breaking any of
> > > the other expected hierarchy. It seems to fine. I didn't get any lockdep
> > > warning.
> >   I think there's a problem in the patch. I don't think you can call
> > free_ind_block() while readers are accessing the file. That could make them
> > think the file contains holes when they aren't there (or even worse read
> > freed blocks or so). So you need to lock i_data_sem before this call (that
> > means you have to move journal_extend() as well). Actually, I don't quite
> > get why ext4_journal_extend() is called in that function. You can just
> > count with the 1 credit in ext4_ext_migrate() when you call
> > ext4_journal_extend() before calling ext4_ext_swap_inode_data(). Oh,
> > probably because free_ind_block() could extend the transaction (which they
> > don't do now as far as I can see).
> 
> ext4_journal_extend is called to extend the journal by one credit to
> take care of writing the block containing inode. I moved the journal
> extend before taking i_data_sem lock.
> 
> diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
> index f97c993..dc6617a 100644
> --- a/fs/ext4/migrate.c
> +++ b/fs/ext4/migrate.c
> @@ -302,10 +302,6 @@ static int ext4_ext_swap_inode_data(handle_t *handle, struct inode *inode,
>  	struct ext4_inode_info *ei = EXT4_I(inode);
>  	struct ext4_inode_info *tmp_ei = EXT4_I(tmp_inode);
>  
> -	retval = free_ind_block(handle, inode);
> -	if (retval)
> -		goto err_out;
> -
>  	/*
>  	 * One credit accounted for writing the
>  	 * i_data field of the original inode
> @@ -318,6 +314,10 @@ static int ext4_ext_swap_inode_data(handle_t *handle, struct inode *inode,
>  	}
>  
>  	down_write(&EXT4_I(inode)->i_data_sem);
> +	retval = free_ind_block(handle, inode);
> +	if (retval)
> +		goto err_out;
> +
>  	/*
>  	 * We have the extent map build with the tmp inode.
>  	 * Now copy the i_data across
> 
> The above change also make sure we don't fail after we free the indirect
> blocks.
  Yeah, now it looks fine.

> >   BTW: That freeing code should really take into account that it can modify
> > bitmaps in different block groups. It's even not that hard to do. Just
> > before each ext4_free_blocks() in free_ind_block() you check whether you
> > have still enough credits in the handle (use h_buffer_credits) and if not,
> > extend it by some amount.
> 
> 
> I have a FIXME at migrate.c:524 documenting exactly that. The
> difficult question was by how much we should extent the journal. ? But
> in reality we might have accumulated enough journal credits, I never
> really ran across a case where we are running out of the journal credit.
  Yes, I don't think it is likely to happen in reality but if somebody
would trigger this, it would be almost impossible to track down so one
should be quite careful with these things...
  And as I described, doing it failsafe is easy - just look how
try_to_extend_transaction() in ext4/inode.c handles similar problems with
truncate.

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ