linux-kernel - Re: [PATCH] configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20081218111536.GR19128@hawkmoon.kerlabs.com>
Date:	Thu, 18 Dec 2008 12:15:36 +0100
From:	Louis Rilling <Louis.Rilling@...labs.com>
To:	linux-kernel@...r.kernel.org
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrew Morton <akpm@...ux-foundation.org>,
	cluster-devel@...hat.com, swhiteho@...hat.com
Subject: Re: [PATCH] configfs: Silence lockdep on mkdir(), rmdir() and
	configfs_depend_item()

On 18/12/08  1:27 -0800, Joel Becker wrote:
> 
> 	I know it's hard, or I'd have sent you patches :-)  In fact,
> Louis tried to use the subclass bits to make this work to a depth of N
> (where N was probably deep enough in practice).  However, this creates
> subclasses that don't get seen by the regular VFS locking - and the big
> deal here is making sure configfs's use of i_mutex meshes with the VFS.
> That is, his code made the warnings go away, but removed much of
> lockdep's ability to see when we got the locking wrong.
> 
> > The thing is, in practise it turns out that reworking code to not run
> > into these issues often makes the code better - if only for the fact
> > that taking locks is expensive and doing less is better, and holding
> > locks stifles concurrency, so holding less it better (yes, I said
> > _often_, there likely are counter cases but I don't believe configfs is
> > one of them).
> 
> 	This isn't about concurrency or speed.  This is about safety
> while configfs is attaching or (especially) detaching config_items from
> the filesystem view it presents.  When the VFS travels down a path, it
> unlocks the trailing directory.  We can't do that when tearing down
> default groups, because we need to lock that small hunk and tear it out
> atomically.
> 
> > Anyway - I'm against just turning lockdep off, that will make folks
> > complacent and let the stuff rot to bits inside - and I for one will
> > refuse to run anything using it (but since that only seems to be
> > dlm/ocfs and I'm of the believe that centralized cluster stuff sucks
> > rocks anyway that won't be a problem).
> 
> 	Oh, be nice :-)
> 	You are absolutely right that turning off lockdep leaves the
> possibility of complacency and bitrot.  That's precisely why I didn't
> like Louis' subclass solution - again, bitrot might go unnoticed.
> 	Now, I know that I will be paying attention to the locking and
> going over it with a fine-toothed comb.  But I'd much rather have an
> actual working lockdep analysis.  Whether that means we find a way for
> lockdep to describe what's happening here, or we find another way to
> keep folks out of the tree we're removing, I don't care.

Perhaps I didn't explain myself well. Quoting my original post:

<quote>
I am proposing two solutions:
1) one that wraps recursive mutex_lock()s with
   lockdep_off()/lockdep_on().
2) (as suggested earlier by Peter Zijlstra) one that puts the
   i_mutexes recursively locked in different classes based on their
   depth from the top-level config_group created. This
   induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the
   nesting of configfs default groups whenever lockdep is activated
   but this limit looks reasonably high. Unfortunately, this alos
   isolates VFS operations on configfs default groups from the others
   and thus lowers the chances to detect locking issues.

This patch implements solution 1).

Solution 2) looks better from lockdep's point of view, but fails with
configfs_depend_item(). This needs to rework the locking
scheme of configfs_depend_item() by removing the variable lock recursion
depth, and I think that it's doable thanks to the configfs_dirent_lock.
For now, let's stick to solution 1).
</quote>

Solution 2) does not play with i_mutex sub-classes as I proposed earlier, but
instead put default_groups' i_mutex in separate classes (actually one class per
default group depth). This is not worse than putting each run queue lock in a
separate class, as it used to be.

For instance, if a created group A has default groups A/B, A/D, and A/B/C, A's
i_mutex class will be the regular i_mutex class used everywhere else in the VFS,
A/B and A/D will have default_group_class[0], and A/B/C will have
default_group_class[1].

Of course those default_group classes will not benefit from locking schemes seen
by lockdep outside configfs, but they still will interact nicely with the VFS.
Moreover, a default group depth limit of 46 (MAX_LOCK_DEPTH - 2) looks rather
reasonable, doesn't it?

To me the real drawback of this solution is that it needs to rework locking in
configfs_depend_item(). Peter says it is preferable, I know how it could be
done, but as any code rework this may bring new bugs, and I realize that I'm
spending time to explain this while 1) I don't have much time to just explain
what could be done, 2) I'd prefer having time to code what I am explaining.
Let's see if I can show you something today.

Louis

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

Download attachment "signature.asc" of type "application/pgp-signature" (190 bytes)