linux-kernel - Re: [PATCHi, man-pages] mount_namespaces.7: More clearly explain "locked mounts"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87r1et1io8.fsf@disp2133>
Date:   Mon, 16 Aug 2021 11:03:03 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Michael Kerrisk <mtk.manpages@...il.com>
Cc:     linux-man <linux-man@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        containers@...ts.linux-foundation.org,
        Alejandro Colomar <alx.manpages@...il.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        linux-kernel@...r.kernel.org, Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCHi, man-pages] mount_namespaces.7: More clearly explain "locked mounts"

Michael Kerrisk <mtk.manpages@...il.com> writes:

> For a long time, this manual page has had a brief discussion of
> "locked" mounts, without clearly saying what this concept is, or
> why it exists. Expand the discussion with an explanation of what
> locked mounts are, why mounts are locked, and some examples of the
> effect of locking.
>
> Thanks to Christian Brauner for a lot of help in understanding
> these details.
>
> Reported-by: Christian Brauner <christian.brauner@...ntu.com>
> Signed-off-by: Michael Kerrisk <mtk.manpages@...il.com>
> ---
>
> Hello Eric and others,
>
> After some quite helpful info from Chrstian Brauner, I've expanded
> the discussion of locked mounts (a concept I didn't really have a
> good grasp on) in the mount_namespaces(7) manual page. I would be
> grateful to receive review comments, acks, etc., on the patch below.
> Could you take a look please?
>
> Cheers,
>
> Michael
>
>  man7/mount_namespaces.7 | 73 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
>
> diff --git a/man7/mount_namespaces.7 b/man7/mount_namespaces.7
> index e3468bdb7..97427c9ea 100644
> --- a/man7/mount_namespaces.7
> +++ b/man7/mount_namespaces.7
> @@ -107,6 +107,62 @@ operation brings across all of the mounts from the original
>  mount namespace as a single unit,
>  and recursive mounts that propagate between
>  mount namespaces propagate as a single unit.)
> +.IP
> +In this context, "may not be separated" means that the mounts
> +are locked so that they may not be individually unmounted.
> +Consider the following example:
> +.IP
> +.RS
> +.in +4n
> +.EX
> +$ \fBsudo mkdir /mnt/dir\fP
> +$ \fBsudo sh \-c \(aqecho "aaaaaa" > /mnt/dir/a\(aq\fP
> +$ \fBsudo mount \-\-bind -o ro /some/path /mnt/dir\fP
> +$ \fBls /mnt/dir\fP   # Former contents of directory are invisible

Do we want a more motivating example such as a /proc/sys?

It has been common to mount over /proc files and directories that can be
written to by the global root so that users in a mount namespace may not
touch them.


> +.EE
> +.in
> +.RE
> +.IP
> +The above steps, performed in a more privileged user namespace,
> +have created a (read-only) bind mount that
> +obscures the contents of the directory
> +.IR /mnt/dir .
> +For security reasons, it should not be possible to unmount
> +that mount in a less privileged user namespace,
> +since that would reveal the contents of the directory
> +.IR /mnt/dir .
 > +.IP
> +Suppose we now create a new mount namespace
> +owned by a (new) subordinate user namespace.
> +The new mount namespace will inherit copies of all of the mounts
> +from the previous mount namespace.
> +However, those mounts will be locked because the new mount namespace
> +is owned by a less privileged user namespace.
> +Consequently, an attempt to unmount the mount fails:
> +.IP
> +.RS
> +.in +4n
> +.EX
> +$ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP
> +               \fBstrace \-o /tmp/log \e\fP
> +               \fBumount /mnt/dir\fP
> +umount: /mnt/dir: not mounted.
> +$ \fBgrep \(aq^umount\(aq /tmp/log\fP
> +umount2("/mnt/dir", 0)     = \-1 EINVAL (Invalid argument)
> +.EE
> +.in
> +.RE
> +.IP
> +The error message from
> +.BR mount (8)
> +is a little confusing, but the
> +.BR strace (1)
> +output reveals that the underlying
> +.BR umount2 (2)
> +system call failed with the error
> +.BR EINVAL ,
> +which is the error that the kernel returns to indicate that
> +the mount is locked.

Do you want to mention that you can unmount the entire subtree?  Either
with pivot_root if it is locked to "/" or with
"umount -l /path/to/propagated/directory".

>  .IP *
>  The
>  .BR mount (2)
> @@ -128,6 +184,23 @@ settings become locked
>  when propagated from a more privileged to
>  a less privileged mount namespace,
>  and may not be changed in the less privileged mount namespace.
> +.IP
> +This point can be illustrated by a continuation of the previous example.
> +In that example, the bind mount was marked as read-only.
> +For security reasons,
> +it should not be possible to make the mount writable in
> +a less privileged namespace, and indeed the kernel prevents this,
> +as illustrated by the following:
> +.IP
> +.RS
> +.in +4n
> +.EX
> +$ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP
> +               \fBmount \-o remount,rw /mnt/dir\fP
> +mount: /mnt/dir: permission denied.
> +.EE
> +.in
> +.RE
>  .IP *
>  .\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree))
>  A file or directory that is a mount point in one namespace that is not

Eric