linux-kernel - Re: [PATCH] configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend

Message-ID: <20090126132453.GD7532@hawkmoon.kerlabs.com>
Date:	Mon, 26 Jan 2009 14:24:53 +0100
From:	Louis Rilling <Louis.Rilling@...labs.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Joel Becker <Joel.Becker@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, cluster-devel@...hat.com,
	swhiteho <swhiteho@...hat.com>
Subject: Re: [PATCH] configfs: Silence lockdep on mkdir(), rmdir() and
	configfs_depend_item()

Hi Peter,

Thank you for continuing this discussion :)

On 26/01/09 13:30 +0100, Peter Zijlstra wrote:
> On Thu, 2008-12-18 at 14:58 -0800, Joel Becker wrote:
> > On Thu, Dec 18, 2008 at 01:28:28PM +0100, Peter Zijlstra wrote:
> > > In fact, both (configfs) mkdir and rmdir seem to synchronize on
> > > su_mutex..
> > > 
> > >  mkdir B/C/bar
> > > 
> > >    C.i_mutex
> > >      su_mutex
> > > 
> > > vs
> > > 
> > >  rmdir foo
> > > 
> > >    parent(foo).i_mutex
> > >      foo.i_mutex
> > >        su_mutex
> > > 
> > > 
> > > once holding the rmdir su_mutex you can check foo's user-content, since
> > > any mkdir will be blocked. All you have to do is then re-validate in
> > > mkdir's su_mutex that !IS_DEADDIR(C).
> > 
> > 	We explicitly do not take any i_mutex locks after taking
> > su_mutex.  That's an ABBA risk.  su_mutex protects the hierarchy of
> > config_items.  i_mutex protects the vfs view thereof.
> 
> I don't think I was suggesting that. All you need is to serialize any
> mkdir/creat against the rmdir of the youngest non-default group, and you
> can do that by holding su_mutex.
> 
> In rmdir, you already own all the i_mutex instances you need to uncouple
> the whole tree, all you need to do is validate that its indeed empty --
> you don't need i_mutex's for that, because you're holding su_mutex, and
> any concurrent mkdir/creat will be blocking on that.
> 
> If you find it empty, just mark everybody DEAD, drop su_mutex and
> decouple. All concurrent mkdir/creat thingies that were blocking will
> now bail because their parent is found DEAD.
> 
> > 	If you look in mkdir, we take su_mutex, get a new item from the
> > client subsystem, then drop su_mutex. 
> 
> All you need to do before dropping su_mutex again is checking
> IS_DEADDIR(), if so, you just fail the whole mkdir() no extra i_mutex's
> needed.
> 
> >  After that, we go about building
> > our filesystem structure, using i_mutex where appropriate. 
> 
> Sure, but its ok to grow the default groups non-atomically, right? mkdir
> will only need to check that everything is empty in as far as it has
> been linked, and ensure the not yet linked entries won't be.
> 
> >  More
> > importantly is rmdir(2), where we use i_mutex in
> > configfs_detach_group(), but are not holding su_sem.  Only when
> > configfs_detach_group() has successfully returned and we have torn down
> > the filesystem structure do we take su_mutex and tear down the
> > config_item structure.
> 
> The only thing that matters is that you can hold su_mutex inside
> i_mutex.
> 
> 
> configfs_rmdir( "foo" )
> {
>  /* we hold i_mutex for foo and its parent */
> 
>  mutex_lock(&subsys->su_mutex);
>  if (default_tree_empty())
>   mark_default_tree_dead();
>  else
>   ret = -EBUSY;
>  mutex_unlock(&subsys->su_mutex);
> 
>  if (ret)
>   return ret;
> 
>  /* do actual unlink foo */
> }
> 
> 
> configfs_mkdir( "B/A/bar" )
> {
>  /* we hold i_mutex for A */
> 
>  mutex_lock(&subsys->su_mutex);
>  if (IS_DEADDIR(A))
>   ret = -EINVAL; /* or whatever */
> 
>  /* increase A's use count, so default_tree_empty() will fail. */
>  inc_A_or_subsys_use_count();
>  mutex_unlock(&subsys->su_mutex);
>  if (ret)
>   return ret;
> 
>  /* do actual mkdir */
> }
> 
> 
> Surely something along these lines ought to work?

I may have missed something important here, but how does your suggestion remove
the need for recursively locking inodes?

In configfs_rmdir(), we do not need to lock the default groups' inodes to prevent
mkdir() from racing under them. This race is already dealt with earlier, in
configfs_detach_prep(), which tries to set the CONFIGFS_USET_DROPPING flag on the
group to remove and on its whole hierarchy of default groups, under the protection
of configfs_dirent_lock. The matching logic in configfs_mkdir() may look a bit
buried but is simple: configfs_attach_item()/configfs_attach_group() eventually
calls configfs_new_dirent(), which fails whenever the parent is tagged with
CONFIGFS_USET_DROPPING. Again, no inode locking is required for this logic.
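
To make the flag logic above concrete, here is a minimal userspace sketch of it.
It is only a model, not the code in fs/configfs/dir.c: every model_* and MODEL_*
name is hypothetical, a pthread mutex stands in for configfs_dirent_lock, the
fixed-size child array replaces the real sibling list, and the error codes are
the model's own choice.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flags, loosely modelled on the CONFIGFS_USET_* ones. */
#define MODEL_USET_DEFAULT  0x1 /* entry is a default group */
#define MODEL_USET_DROPPING 0x2 /* an rmdir() of this entry is in progress */
#define MODEL_MAX_CHILDREN  8

struct model_dirent {
	unsigned int flags;
	struct model_dirent *children[MODEL_MAX_CHILDREN];
	size_t nr_children;
};

/* Stand-in for configfs_dirent_lock (assumption: a mutex is enough here). */
static pthread_mutex_t model_dirent_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Tag sd and its whole hierarchy of default groups as DROPPING.
 * Fails as soon as a user-created (non-default) child is found.
 * Caller holds model_dirent_lock.
 */
static int model_detach_prep_locked(struct model_dirent *sd)
{
	size_t i;
	int ret;

	sd->flags |= MODEL_USET_DROPPING;
	for (i = 0; i < sd->nr_children; i++) {
		struct model_dirent *child = sd->children[i];

		if (!(child->flags & MODEL_USET_DEFAULT))
			return -ENOTEMPTY;
		ret = model_detach_prep_locked(child);
		if (ret)
			return ret;
	}
	return 0;
}

/* Undo the tagging after a failed prep. Caller holds model_dirent_lock. */
static void model_detach_rollback_locked(struct model_dirent *sd)
{
	size_t i;

	sd->flags &= ~MODEL_USET_DROPPING;
	for (i = 0; i < sd->nr_children; i++)
		if (sd->children[i]->flags & MODEL_USET_DEFAULT)
			model_detach_rollback_locked(sd->children[i]);
}

/* rmdir() side: tag the tree, or leave it untouched on failure. */
int model_rmdir_prep(struct model_dirent *sd)
{
	int ret;

	assert(sd != NULL);
	pthread_mutex_lock(&model_dirent_lock);
	ret = model_detach_prep_locked(sd);
	if (ret)
		model_detach_rollback_locked(sd);
	pthread_mutex_unlock(&model_dirent_lock);
	return ret;
}

/* mkdir() side: linking a new entry fails under a DROPPING parent. */
int model_new_dirent(struct model_dirent *parent, struct model_dirent *child,
		     unsigned int flags)
{
	int ret = 0;

	assert(parent != NULL && child != NULL);
	pthread_mutex_lock(&model_dirent_lock);
	if (parent->flags & MODEL_USET_DROPPING) {
		ret = -ENOENT;
	} else if (parent->nr_children == MODEL_MAX_CHILDREN) {
		ret = -ENOSPC;
	} else {
		child->flags = flags;
		child->nr_children = 0;
		parent->children[parent->nr_children++] = child;
	}
	pthread_mutex_unlock(&model_dirent_lock);
	return ret;
}
```

In this model, both sides decide under the same lock, so a mkdir() either links
its entry before the tree is tagged (and the rmdir() then fails because the tree
is not empty) or finds its parent already tagged and fails itself. No per-inode
lock is involved in that decision.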

However, configfs_rmdir() and configfs_mkdir() do (recursively) lock inodes, because
this is how the VFS works when removing/adding entries under a directory which
has already lived in the dcache.

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes
