lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100723100408.GB3305@quack.suse.cz>
Date:	Fri, 23 Jul 2010 12:04:08 +0200
From:	Jan Kara <jack@...e.cz>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Jan Kara <jack@...e.cz>, Stephen Rothwell <sfr@...b.auug.org.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Chinner <david@...morbit.com>, linux-next@...r.kernel.org,
	LKML <linux-kernel@...r.kernel.org>,
	Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>
Subject: Re: linux-next: OOPS at boot time

  Hi Al,

On Wed 21-07-10 22:40:16, Al Viro wrote:
> On Wed, Jul 21, 2010 at 02:11:17PM +0200, Jan Kara wrote:
> >   Thanks for bisecting this. The patch series indeed seems to uncover
> > some discrepancies.
> >   Ext3 has always dirtied inode in it's ->delete_inode method (via quota
> > code). But previously clear_inode() just overwrote the state with I_CLEAR
> > and thus we never saw the BUG_ON. After Al's patches, i_state is set in
> > end_writeback() which happens earlier. In particular it happens before
> > ext3_free_inode() which dirties the inode through quota code while freeing
> > xattrs - they are accounted in i_blocks, so i_blocks are updated during
> > freeing and inode is dirtied.
> >   Actually, ext3_mark_inode_dirty() called during each mark_inode_dirty()
> > call writes the inode state to the journal so the dirty flag in the inode
> > state is in fact stale and overwriting it with I_CLEAR never mattered. In
> > this sense, the BUG_ON triggered is a false positive. But I believe this is
> > a separate story.
> 
> Could you please explain why the hell does ext2_xattr_delete_inode() call the
> dirtying variant of dquot_free_block()?  Note that the inode is well beyond
> the point where writeback would consider touching it; this mark_inode_dirty()
> will do nothing useful whatsoever at that place.
  Yes, I'm aware that dirtying inode does nothing useful at that point.
But OTOH it does no harm so why not call the standard variant of quota
function? But I have no strong opinion on this particular callsite so
using dquot_free_block_nodirty() at that place is OK with me.
 
> Anyway, I've dealt with ext3 and ext2 (both b0rken with quota) and AFAICS
> the rest of quota-supporting filesystems had been OK.  Changes:
> 
> ext3_evict_inode() hd end_writeback() shifted downstream (needed anyway)
  I like this.

> ext2 switched to nodirty variant of dquot_free_block()
  But I don't like this - see comments in the patch. I'd be much more happy
if ext2 did the same thing as ext3 - pulled ext2_xattr_delete_inode()
call out of ext2_free_inode called end_writeback() after it.

> ext2_xattr_delete_inode() doesn't try to dirty inode anymore (always
> had been pointless).
  As I wrote above I don't care about this very much.

> It's in for-next, should propagate to git.kernel.org in a few.
> 
> Diff against the buggy version follows, feel free to try.
> diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
> index e8766a3..c6c684b 100644
> --- a/fs/ext2/balloc.c
> +++ b/fs/ext2/balloc.c
> @@ -571,7 +571,7 @@ do_more:
>  error_return:
>  	brelse(bitmap_bh);
>  	release_blocks(sb, freed);
> -	dquot_free_block(inode, freed);
> +	dquot_free_block_nodirty(inode, freed);
>  }
The above call in ext2_free_blocks() modifies the inode and
ext2_free_blocks is enough "black-box" function that it should do the right
thing and dirty the inode if it modified it.
 
>  /**
> @@ -1418,7 +1418,8 @@ allocated:
>  
>  	*errp = 0;
>  	brelse(bitmap_bh);
> -	dquot_free_block(inode, *count-num);
> +	dquot_free_block_nodirty(inode, *count-num);
> +	mark_inode_dirty(inode);
>  	*count = num;
>  	return ret_block;
>  
> @@ -1428,8 +1429,10 @@ out:
>  	/*
>  	 * Undo the block allocation
>  	 */
> -	if (!performed_allocation)
> -		dquot_free_block(inode, *count);
> +	if (!performed_allocation) {
> +		dquot_free_block_nodirty(inode, *count);
> +		mark_inode_dirty(inode);
> +	}
>  	brelse(bitmap_bh);
>  	return 0;
>  }
  Sorry, but the above two changes look stupid... Why call _nodirty variant
and dirty the inode immediately after that? It happens in two other places
in your patch as well...

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ