linux-kernel - Re: do_change_type(): refuse to operate on unmounted/not ours mounts

Message-ID: <CAE1zp74Myaab_U5ZswjCE=ND66bT907Y=vmsk14hV89R_ugbtg@mail.gmail.com>
Date: Thu, 31 Jul 2025 10:40:40 +0800
From: Pavel Tikhomirov <snorcht@...il.com>
To: Andrei Vagin <avagin@...gle.com>
Cc: Al Viro <viro@...iv.linux.org.uk>, Andrei Vagin <avagin@...il.com>, 
	Christian Brauner <brauner@...nel.org>, linux-fsdevel <linux-fsdevel@...r.kernel.org>, 
	LKML <linux-kernel@...r.kernel.org>, criu@...ts.linux.dev, 
	Linux API <linux-api@...r.kernel.org>, stable <stable@...r.kernel.org>
Subject: Re: do_change_type(): refuse to operate on unmounted/not ours mounts

If detached mounts are our only concern, it looks like the check, instead of:

if (!check_mnt(mnt)) {
        err = -EINVAL;
        goto out_unlock;
}

could've been a more relaxed one:

if (mnt_detached(mnt)) {
        err = -EINVAL;
        goto out_unlock;
}

bool mnt_detached(struct mount *mnt)
{
        return !mnt->mnt_ns;
}

so that a propagation change is refused only on detached mounts (since
umount_tree() sets mnt_ns to NULL).
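
To make the difference between the two checks concrete, here is a toy
userspace model. It is not kernel code: the structs keep only the one
field the checks read, current_mnt_ns is a hypothetical stand-in for
current->nsproxy->mnt_ns, and strict_allows()/relaxed_allows() are
illustrative wrappers that do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; only mnt_ns matters here. */
struct mnt_namespace { int id; };
struct mount { struct mnt_namespace *mnt_ns; };

/* Hypothetical stand-in for current->nsproxy->mnt_ns. */
static struct mnt_namespace *current_mnt_ns;

/* Strict check: the mount must be in the caller's mount namespace. */
static bool check_mnt(const struct mount *mnt)
{
        return mnt->mnt_ns == current_mnt_ns;
}

/* Relaxed check: only a mount with no namespace counts as detached. */
static bool mnt_detached(const struct mount *mnt)
{
        return !mnt->mnt_ns;
}

/* Illustrative wrappers: true means the propagation change may proceed. */
static bool strict_allows(const struct mount *mnt)
{
        return check_mnt(mnt);
}

static bool relaxed_allows(const struct mount *mnt)
{
        return !mnt_detached(mnt);
}
```

In this model a mount in the caller's namespace passes both checks, a
mount in some other namespace (the CRIU case) is refused only by the
strict check, and a mount with mnt_ns == NULL is refused by both.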

Also, do_mount_setattr() already has a more relaxed check:

if ((mnt_has_parent(mnt) || !is_anon_ns(mnt->mnt_ns)) && !check_mnt(mnt))
        goto out;

Best Regards, Tikhomirov Pavel.

On Sun, Jul 27, 2025 at 5:01 AM Andrei Vagin <avagin@...gle.com> wrote:
>
> On Sat, Jul 26, 2025 at 10:53 AM Al Viro <viro@...iv.linux.org.uk> wrote:
> >
> > On Sat, Jul 26, 2025 at 10:12:34AM -0700, Andrei Vagin wrote:
> > > On Thu, Jul 24, 2025 at 4:00 PM Al Viro <viro@...iv.linux.org.uk> wrote:
> > > >
> > > > On Thu, Jul 24, 2025 at 01:02:48PM -0700, Andrei Vagin wrote:
> > > > > Hi Al and Christian,
> > > > >
> > > > > The commit 12f147ddd6de ("do_change_type(): refuse to operate on
> > > > > unmounted/not ours mounts") introduced an ABI backward compatibility
> > > > > break. CRIU depends on the previous behavior, and users are now
> > > > > reporting criu restore failures following the kernel update. This change
> > > > > has been propagated to stable kernels. Is this check strictly required?
> > > >
> > > > Yes.
> > > >
> > > > > Would it be possible to check only if the current process has
> > > > > CAP_SYS_ADMIN within the mount user namespace?
> > > >
> > > > Not enough, both in terms of permissions *and* in terms of "thou
> > > > shalt not bugger the kernel data structures - nobody's privileged
> > > > enough for that".
> > >
> > > Al,
> > >
> > > I am still thinking in terms of "Thou shalt not break userspace"...
> > >
> > > Seriously though, this original behavior has been in the kernel for 20
> > > years, and it hasn't triggered any corruptions in all that time.
> >
> > For a very mild example of fun to be had there:
> >         mount("none", "/mnt", "tmpfs", 0, "");
> >         chdir("/mnt");
> >         umount2(".", MNT_DETACH);
> >         mount(NULL, ".", NULL, MS_SHARED, NULL);
> > Repeat in a loop, watch mount group id leak.  That's a trivial example
> > of violating the assertion ("a mount that had been through umount_tree()
> > is out of propagation graph and related data structures for good").
>
> I wasn't referring to detached mounts. CRIU modifies mounts from
> non-current namespaces.
>
> >
> > As for the "CAP_SYS_ADMIN within the mount user namespace" - which
> > userns do you have in mind?
> >
>
> The user namespace of the target mount:
> ns_capable(mnt->mnt_ns->user_ns, CAP_SYS_ADMIN)
>