lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 8 Oct 2010 09:14:31 -0400
From:	Chris Mason <chris.mason@...cle.com>
To:	Dave Chinner <david@...morbit.com>
Cc:	Al Viro <viro@...IV.linux.org.uk>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 09/18] fs: rework icount to be a locked variable

On Fri, Oct 08, 2010 at 09:15:49PM +1100, Dave Chinner wrote:
> On Fri, Oct 08, 2010 at 10:32:02AM +0100, Al Viro wrote:
> > On Fri, Oct 08, 2010 at 04:21:23PM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@...hat.com>
> > > 
> > > The inode reference count is currently an atomic variable so that it can be
> > > sampled/modified outside the inode_lock. However, the inode_lock is still
> > > needed to synchronise the final reference count and checks against the inode
> > > state.
> > > 
> > > To avoid needing the protection of the inode lock, protect the inode reference
> > > count with the per-inode i_lock and convert it to a normal variable. To avoid
> > > existing out-of-tree code accidentally compiling against the new method, rename
> > > the i_count field to i_ref. This is relatively straight forward as there
> > > are limited external references to the i_count field remaining.
> > 
> > You are overdoing the information hiding here; _way_ too many small
> > functions that don't buy you anything so far, AFAICS.
> 
> See akpm's comments on the previous version of the series.
> 
> > Moreover, why
> > the hell not make them static inlines and get rid of the exports?
> 
> Yes, that is probably sensible.
> 
> > 
> > > -	if (atomic_add_unless(&inode->i_count, -1, 1))
> > > +	/* XXX: filesystems should not play refcount games like this */
> > > +	spin_lock(&inode->i_lock);
> > > +	if (inode->i_ref > 1) {
> > > +		inode->i_ref--;
> > > +		spin_unlock(&inode->i_lock);
> > >  		return;
> > > +	}
> > > +	spin_unlock(&inode->i_lock);
> > 
> > ... or, perhaps, they needs a helper along the lines of "try to do iput()
> > if it's known to hit easy case".
> > 
> > I really don't like the look of code around -ENOSPC returns, though.
> > What exactly is going on there?  Can it e.g. interfere with that
> > delayed iput stuff?
> 
> I have no idea what the btrfs code is doing, hence I haven't tried
> to clean it up or provide any helpers for it. It looks like a hack
> around a problem in the btrfs reference counting model to me...

The problem is that we're not allowed to do the final iput for one
specific caller because it can deadlock on inode deletion.  That one
specific caller doesn't happen very often.

For the deadlock avoidance case, we do the fast atomic_dec as long as we
aren't the last holder and the slow iput-by-a-thread-at-a-safe-time if
we are.  Lots of different filesystem code dances around avoiding iput
inode deletion, should we make this a more generic setup?

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ