[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20180814171516.255036169@linuxfoundation.org>
Date: Tue, 14 Aug 2018 19:16:30 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>
Subject: [PATCH 4.14 016/104] fix __legitimize_mnt()/mntput() race
4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro <viro@...iv.linux.org.uk>
commit 119e1ef80ecfe0d1deb6378d4ab41f5b71519de1 upstream.
__legitimize_mnt() has two problems - one is that in case of success
the check of mount_lock is not ordered wrt preceding increment of
refcount, making it possible to have successful __legitimize_mnt()
on one CPU just before the otherwise final mntpu() on another,
with __legitimize_mnt() not seeing mntput() taking the lock and
mntput() not seeing the increment done by __legitimize_mnt().
Solved by a pair of barriers.
Another is that failure of __legitimize_mnt() on the second
read_seqretry() leaves us with reference that'll need to be
dropped by caller; however, if that races with final mntput()
we can end up with caller dropping rcu_read_lock() and doing
mntput() to release that reference - with the first mntput()
having freed the damn thing just as rcu_read_lock() had been
dropped. Solution: in "do mntput() yourself" failure case
grab mount_lock, check if MNT_DOOMED has been set by racing
final mntput() that has missed our increment and if it has -
undo the increment and treat that as "failure, caller doesn't
need to drop anything" case.
It's not easy to hit - the final mntput() has to come right
after the first read_seqretry() in __legitimize_mnt() *and*
manage to miss the increment done by __legitimize_mnt() before
the second read_seqretry() in there. The things that are almost
impossible to hit on bare hardware are not impossible on SMP
KVM, though...
Reported-by: Oleg Nesterov <oleg@...hat.com>
Fixes: 48a066e72d97 ("RCU'd vsfmounts")
Cc: stable@...r.kernel.org
Signed-off-by: Al Viro <viro@...iv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
fs/namespace.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -659,12 +659,21 @@ int __legitimize_mnt(struct vfsmount *ba
return 0;
mnt = real_mount(bastard);
mnt_add_count(mnt, 1);
+ smp_mb(); // see mntput_no_expire()
if (likely(!read_seqretry(&mount_lock, seq)))
return 0;
if (bastard->mnt_flags & MNT_SYNC_UMOUNT) {
mnt_add_count(mnt, -1);
return 1;
}
+ lock_mount_hash();
+ if (unlikely(bastard->mnt_flags & MNT_DOOMED)) {
+ mnt_add_count(mnt, -1);
+ unlock_mount_hash();
+ return 1;
+ }
+ unlock_mount_hash();
+ /* caller will mntput() */
return -1;
}
@@ -1210,6 +1219,11 @@ static void mntput_no_expire(struct moun
return;
}
lock_mount_hash();
+ /*
+ * make sure that if __legitimize_mnt() has not seen us grab
+ * mount_lock, we'll see their refcount increment here.
+ */
+ smp_mb();
mnt_add_count(mnt, -1);
if (mnt_get_count(mnt)) {
rcu_read_unlock();
Powered by blists - more mailing lists