Message-ID: <CAEWA0a6jgj8vQhrijSJXUHBnCTtz0HEV66tmaVKPe83ng=3feQ@mail.gmail.com>
Date: Sat, 26 Jul 2025 14:01:20 -0700
From: Andrei Vagin <avagin@...gle.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Andrei Vagin <avagin@...il.com>, Christian Brauner <brauner@...nel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
criu@...ts.linux.dev, Linux API <linux-api@...r.kernel.org>,
stable <stable@...r.kernel.org>
Subject: Re: do_change_type(): refuse to operate on unmounted/not ours mounts

On Sat, Jul 26, 2025 at 10:53 AM Al Viro <viro@...iv.linux.org.uk> wrote:
>
> On Sat, Jul 26, 2025 at 10:12:34AM -0700, Andrei Vagin wrote:
> > On Thu, Jul 24, 2025 at 4:00 PM Al Viro <viro@...iv.linux.org.uk> wrote:
> > >
> > > On Thu, Jul 24, 2025 at 01:02:48PM -0700, Andrei Vagin wrote:
> > > > Hi Al and Christian,
> > > >
> > > > The commit 12f147ddd6de ("do_change_type(): refuse to operate on
> > > > unmounted/not ours mounts") introduced an ABI backward compatibility
> > > > break. CRIU depends on the previous behavior, and users are now
> > > > reporting criu restore failures following the kernel update. This change
> > > > has been propagated to stable kernels. Is this check strictly required?
> > >
> > > Yes.
> > >
> > > > Would it be possible to check only if the current process has
> > > > CAP_SYS_ADMIN within the mount user namespace?
> > >
> > > Not enough, both in terms of permissions *and* in terms of "thou
> > > shalt not bugger the kernel data structures - nobody's privileged
> > > enough for that".
> >
> > Al,
> >
> > I am still thinking in terms of "Thou shalt not break userspace"...
> >
> > Seriously though, this original behavior has been in the kernel for 20
> > years, and it hasn't triggered any corruptions in all that time.
>
> For a very mild example of fun to be had there:
> mount("none", "/mnt", "tmpfs", 0, "");
> chdir("/mnt");
> umount2(".", MNT_DETACH);
> mount(NULL, ".", NULL, MS_SHARED, NULL);
> Repeat in a loop, watch mount group id leak. That's a trivial example
> of violating the assertion ("a mount that had been through umount_tree()
> is out of propagation graph and related data structures for good").
I wasn't referring to detached mounts. CRIU modifies mounts from
non-current namespaces.
>
> As for the "CAP_SYS_ADMIN within the mount user namespace" - which
> userns do you have in mind?
>
The user namespace of the target mount:

    ns_capable(mnt->mnt_ns->user_ns, CAP_SYS_ADMIN)
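To make the difference between the two checks concrete, here is a minimal userspace sketch. It is a toy model, not kernel code: every type and function name below (`mount_model`, `model_check_mnt`, `model_proposed_check`, and so on) is hypothetical and exists only for illustration. It assumes that the stricter check amounts to "the mount belongs to the caller's own mount namespace", and that the proposed check would also have to reject detached mounts (modelled as a NULL `mnt_ns`) before looking at the owning user namespace, since the expression above would otherwise dereference a mount with no namespace. The real ns_capable() also honours capabilities held in ancestor user namespaces; the model simplifies that to a single pointer comparison.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; these are NOT the kernel definitions. */
struct user_ns_model {
    int id;
};

struct mnt_ns_model {
    struct user_ns_model *user_ns;  /* user namespace owning this mount namespace */
};

struct mount_model {
    struct mnt_ns_model *mnt_ns;    /* NULL models a detached (unmounted) mount */
};

struct task_model {
    struct mnt_ns_model *mnt_ns;    /* caller's current mount namespace */
    struct user_ns_model *admin_in; /* the one userns where the caller holds
                                       CAP_SYS_ADMIN (simplification) */
};

/* Assumed shape of the stricter check: the mount must be attached and must
 * live in the caller's own mount namespace. */
static inline bool model_check_mnt(const struct task_model *t,
                                   const struct mount_model *m)
{
    return m->mnt_ns != NULL && m->mnt_ns == t->mnt_ns;
}

/* Assumed shape of the proposed check: the mount must be attached to some
 * mount namespace, and the caller must hold CAP_SYS_ADMIN in the user
 * namespace that owns it. The NULL test is an addition of this sketch. */
static inline bool model_proposed_check(const struct task_model *t,
                                        const struct mount_model *m)
{
    if (m->mnt_ns == NULL)
        return false;               /* detached mounts are refused */
    return m->mnt_ns->user_ns == t->admin_in;
}
```

Under these assumptions, a mount in a non-current namespace owned by the caller's user namespace fails the first check but passes the second, which is the CRIU case described above. A detached mount, as in the MNT_DETACH example, is refused by both.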