linux-kernel - Re: [PATCH 03/34] teach move_mount(2) to work with OPEN_TREE

Message-ID: <20181020114826.GM32577@ZenIV.linux.org.uk>
Date:   Sat, 20 Oct 2018 12:48:26 +0100
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Alan Jenkins <alan.christopher.jenkins@...il.com>
Cc:     David Howells <dhowells@...hat.com>, torvalds@...ux-foundation.org,
        ebiederm@...ssion.com, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, mszeredi@...hat.com
Subject: Re: [PATCH 03/34] teach move_mount(2) to work with OPEN_TREE_CLONE
 [ver #12]

On Sat, Oct 20, 2018 at 12:06:32PM +0100, Alan Jenkins wrote:

> You posted an analysis of a GPF, where you showed the reference count was
> clearly one less than it should have been.  You narrowed this down to a step
> where you connected an unmounted mount (MNT_UMOUNT) to a mounted mount.  So
> your analysis is consistent with the comment in disconnect_mount(), which
> says 1) you're not allowed to do that, 2) the reason is because of different
> reference-counting rules.  AFAICT, the GPF you analyzed would be prevented
> by the fix in do_move_mount(), checking for MNT_UMOUNT.

Not just refcounting; it's that fs_pin is really intended to have ->kill()
triggered only once.  If you look at pin_kill() (which is where the
livelock happened) you'll see what's going on - anyone hitting it between
the first call and freeing of the object will be sleeping until ->kill()
from the first call gets through pin_remove(), at which point they bugger
off (being very careful with accessing the sucker to avoid use-after-free).
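
To make the "->kill() only once" rule concrete, here is a minimal
single-threaded sketch in userspace C.  It is a toy model, not the
kernel's fs_pin code: struct toy_pin, toy_pin_kill(), toy_pin_remove()
and the three states are names made up for illustration, and the
sleeping of late callers is reduced to an early return.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; this is not the kernel's struct fs_pin. */
enum toy_pin_state { TOY_PIN_LIVE, TOY_PIN_DYING, TOY_PIN_DEAD };

struct toy_pin {
	enum toy_pin_state state;
	int kill_calls;			/* how many times ->kill() ran */
	void (*kill)(struct toy_pin *);
};

/* Hypothetical stand-in for pin_remove(): teardown has finished. */
static void toy_pin_remove(struct toy_pin *p)
{
	p->state = TOY_PIN_DEAD;
}

/* Example ->kill() callback; it must only ever see a dying pin. */
static void toy_kill(struct toy_pin *p)
{
	assert(p->state == TOY_PIN_DYING);
	p->kill_calls++;
	toy_pin_remove(p);
}

/*
 * Returns true if this caller was the one that ran ->kill().
 * A real implementation would make late callers sleep until the
 * first caller reaches the remove step; this toy just backs off.
 */
static bool toy_pin_kill(struct toy_pin *p)
{
	if (p->state != TOY_PIN_LIVE)
		return false;		/* kill already started or done */
	p->state = TOY_PIN_DYING;
	p->kill(p);
	return true;
}
```

Calling toy_pin_kill() twice on the same object runs the callback once;
the second call sees a non-live state and returns without touching
->kill().  Reattaching something that has already started dying is what
breaks that assumption.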

MNT_UMOUNT means that there's no way back.

> pre-date MNT_UMOUNT.  I *think* the added check in dissolve_on_fput() makes
> things right, but I don't understand enough to be sure.

That, plus making sure that do_move_mount() grabs a reference when it
successfully attaches a tree.  I hate passing a bool argument, BTW -
better to do mnt_add_count() either before attach_recursive_mnt(),
decrementing on failure, or, better yet, only on success.  Note
that namespace_sem is held, so the damn thing *can't* disappear under
us - nobody will be able to detach it until we drop namespace_sem.
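
Roughly the shape I mean, as a userspace toy in C rather than a patch
against fs/namespace.c.  Everything named toy_* or TOY_* is invented
for the sketch (including the flag values and the failure condition in
toy_attach()), and the caller is simply assumed to hold the namespace
lock across the whole call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy flag values; the kernel's MNT_* constants differ. */
#define TOY_MNT_LOCKED	0x1
#define TOY_MNT_UMOUNT	0x2

struct toy_mount {
	int count;			/* reference count */
	int flags;
	struct toy_mount *parent;	/* NULL while detached */
};

/* Hypothetical stand-in for attach_recursive_mnt(); fails on NULL target. */
static int toy_attach(struct toy_mount *old, struct toy_mount *target)
{
	if (!target)
		return -EINVAL;
	old->parent = target;
	return 0;
}

/* Caller is assumed to hold the namespace lock for the whole call. */
static int toy_move_mount(struct toy_mount *old, struct toy_mount *target)
{
	int err;

	assert(old != NULL);

	/* A locked or already-unmounted mount must not be attached. */
	if (old->flags & (TOY_MNT_LOCKED | TOY_MNT_UMOUNT))
		return -EINVAL;

	err = toy_attach(old, target);
	if (err)
		return err;

	/* Reference taken only on success: no bool argument, no undo path. */
	old->count++;
	return 0;
}
```

Since the increment sits after the only step that can fail, there is
nothing to decrement on the error paths, and the lock held by the
caller is what makes it safe to bump the count that late.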

> diff --git a/fs/namespace.c b/fs/namespace.c
> index 4dfe7e23b7ee..e8d61d5f581d 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -1763,7 +1763,7 @@ void dissolve_on_fput(struct vfsmount *mnt)
>  {
>  	namespace_lock();
>  	lock_mount_hash();
> -	if (!real_mount(mnt)->mnt_ns) {
> +	if (!real_mount(mnt)->mnt_ns && !(mnt->mnt_flags & MNT_UMOUNT)) {
>  		mntget(mnt);
>  		umount_tree(real_mount(mnt), UMOUNT_CONNECTED);
>  	}
> @@ -2469,7 +2469,7 @@ static int do_move_mount(struct path *old_path, struct path *new_path)
>  	if (old->mnt_ns && !attached)
>  		goto out1;
>  
> -	if (old->mnt.mnt_flags & MNT_LOCKED)
> +	if (old->mnt.mnt_flags & (MNT_LOCKED | MNT_UMOUNT))
>  		goto out1;
>  
>  	if (old_path->dentry != old_path->mnt->mnt_root)