[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231123103622.4mfjwrmxr4tl53hi@quack3>
Date: Thu, 23 Nov 2023 11:36:22 +0100
From: Jan Kara <jack@...e.cz>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: linux-fsdevel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mo Zou <lostzoumo@...il.com>, Jan Kara <jack@...e.cz>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 9/9] rename(): avoid a deadlock in the case of parents
having no common ancestor
On Wed 22-11-23 19:36:52, Al Viro wrote:
> ... and fix the directory locking documentation and proof of correctness.
> Holding ->s_vfs_rename_mutex *almost* prevents ->d_parent changes; the
> case where we really don't want it is splicing the root of disconnected
> tree to somewhere.
>
> In other words, ->s_vfs_rename_mutex is sufficient to stabilize "X is an
> ancestor of Y" only if X and Y are already in the same tree. Otherwise
> it can go from false to true, and one can construct a deadlock on that.
>
> Make lock_two_directories() report an error in such case and update the
> callers of lock_rename()/lock_rename_child() to handle such errors.
>
> And yes, such conditions are not impossible to create ;-/
>
> Signed-off-by: Al Viro <viro@...iv.linux.org.uk>
Looks good to me. Just one nit below but whether you decide to address it
or not, feel free to add:
Reviewed-by: Jan Kara <jack@...e.cz>
> +// p1 != p2, both are on the same filesystem, ->s_vfs_rename_mutex is held
> static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2)
> {
> - struct dentry *p;
> + struct dentry *p = p1, *q = p2, *r;
>
> - p = d_ancestor(p2, p1);
> - if (p) {
> + while ((r = p->d_parent) != p2 && r != p)
> + p = r;
> + if (r == p2) {
> + // p is a child of p2 and an ancestor of p1 or p1 itself
> inode_lock_nested(p2->d_inode, I_MUTEX_PARENT);
> inode_lock_nested(p1->d_inode, I_MUTEX_PARENT2);
> return p;
> }
> -
> - p = d_ancestor(p1, p2);
> - inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> - inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> - return p;
> + // p is the root of connected component that contains p1
> + // p2 does not occur on the path from p to p1
> + while ((r = q->d_parent) != p1 && r != p && r != q)
> + q = r;
> + if (r == p1) {
> + // q is a child of p1 and an ancestor of p2 or p2 itself
> + inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> + inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> + return q;
> + } else if (likely(r == p)) {
> + // both p2 and p1 are descendents of p
> + inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
> + inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2);
> + return NULL;
> + } else { // no common ancestor at the time we'd been called
> + mutex_unlock(&p1->d_sb->s_vfs_rename_mutex);
It would look more natural to me if s_vfs_rename_mutex got dropped in the
callers (lock_rename(), lock_rename_child()) which have acquired the lock
instead of here. I agree it results in a bit more boiler plate code though.
> + return ERR_PTR(-EXDEV);
> + }
> }
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists