[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160902170256.GU2356@ZenIV.linux.org.uk>
Date: Fri, 2 Sep 2016 18:02:56 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Rainer Weikusat <rweikusat@...eradapt.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
CAI Qian <caiqian@...hat.com>,
Miklos Szeredi <miklos@...redi.hu>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Rainer Weikusat <rweikusat@...ileactivedefense.com>,
Eric Sandeen <esandeen@...hat.com>,
Network Development <netdev@...r.kernel.org>
Subject: Re: possible circular locking dependency detected
On Fri, Sep 02, 2016 at 05:10:13PM +0100, Rainer Weikusat wrote:
> > Bullshit. kern_path_create() *does* mean the same thing in all cases.
> > Namely, find the parent, lock it and leave the final name component for
> > the create-type operation. It sure as hell is not guaranteed to take
> > *all* locks that are going to be taken in process of mknod/mkdir/etc.
> > Never had been.
>
> This isn't about "all locks", it's about the lock in question. No other
> mknod operation (I'm aware of) calls this with another superblock than
> the one already acted upon by kern_path_create. This may be wrong (if
> so, feel free to correct it) but it's not "bullshit" (intentional
> deception in order to sell something to someone).
>
Never had been promised. And it's not just this lock - e.g. ->i_rwsem is
taken on the parent by kern_path_create() and on parent in underlying
filesystem by ecryptfs ->mknod() (as well as overlayfs one). bind/bind
deadlock - one for a path to ecryptfs, another for that on the raw
filesystem behind it (which can be mounted elsewhere/in another namespace/etc.)
with those paths ending in the matching directories (the last components may
be same or different - doesn't matter)
A: lock parent in ecryptfs (via kern_path_create())
B: lock the directory behind it in underlying fs (ditto)
A: grab ->readlock
B: block on ->readlock
A: call ecryptfs_mknod() and block trying to lock the directory held by B
Deadlock. And while we are at it, ecryptfs probably ought to claim transient
write access for the duration of modifications of the underlying one similar
to overlayfs. The point is, it had never been promised that you can stick
random locks just outside of ->mknod()/->mkdir()/etc. The same goes for e.g.
NFS mount of something exported by localhost; knfsd must lock the parent
directory on server before creating anything in it. Suppose you have
/srv/nfs/foo exported and mounted on the same host at /mnt/bar. bind to
/mnt/bar/barf/a vs. bind to /srv/nfs/foo/barf/b:
A: lock /mnt/bar/barf
B: lock /srv/nfs/foo/barf
A: grab ->readlock
B: block on ->readlock
A: call nfs_mknod(), wait for reply
knfsd: block trying to lock /srv/nfs/foo/barf
It's very much _not_ just overlayfs being pathological - that it certainly is,
but the problem is much wider. You could try to argue that kern_path_create()
should've known to lock all relevant directories in case of overlayfs and
ecryptfs, but it has absolutely no chance to do that in case of NFS - the
protocol doesn't allow "lock this directory, one of my next requests will
be to create something in it". Even leases are only for regular files...
Powered by blists - more mailing lists