linux-kernel - fsconfig parsing bugs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALxfFW4BXhEwxR0Q5LSkg-8Vb4r2MONKCcUCVioehXQKr35eHg@mail.gmail.com>
Date:   Wed, 31 Aug 2022 16:12:21 -0700
From:   Seth Jenkins <sethjenkins@...gle.com>
To:     viro@...iv.linux.org.uk, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, dhowells@...hat.com
Cc:     Jann Horn <jannh@...gle.com>,
        Natalie Silvanovich <natashenka@...gle.com>
Subject: fsconfig parsing bugs

The codebase-wide refactor efforts to using the latest fs mounting API
(with support for fsopen/fsconfig/fsmount etc.) have introduced some
bugs into mount configuration parsing in several parse_param handlers,
most notably shmem_parse_one() which can be accessed from a userns.
There are several cases where the following code pattern is used:

ctx->value = <expression>
if(ctx->value is invalid)
   goto fail;
ctx->seen |= SHMEM_SEEN_X;
break;

However, this coding pattern does not work in the case where multiple
fsconfig calls are made. For example, if I were to call fsconfig with
the key "nr_blocks" twice, the first time with a valid value, and the
second time with an invalid value, the invalid value will be persisted
and used upon creation of the mount for the value of ctx->blocks, and
consequently for sbinfo->max_blocks.

This code pattern is used for Opt_nr_blocks, Opt_nr_inodes, Opt_uid,
Opt_gid and Opt_huge. Probably the proper thing to do is to check for
validity before assigning the value to the shmem_options struct in the
fs_context.

We also see this code pattern replicated throughout other filesystems
for uid/gid resolution, including hugetlbfs, FUSE, ntfs3 and ffs.

The other outstanding issue I noticed comes from the fact that
fsconfig syscalls may occur in a different userns than that which
called fsopen. That means that resolving the uid/gid via
current_user_ns() can save a kuid that isn't mapped in the associated
namespace when the filesystem is finally mounted. This means that it
is possible for an unprivileged user to create files owned by any
group in a tmpfs mount (since we can set the SUID bit on the tmpfs
directory), or a tmpfs that is owned by any user, including the root
group/user. This is probably outside the original intention of this
code.

The fix for this bug is not quite so simple as the others. The options
that I've assessed are:

- Resolve the kuid/kgid via the fs_context namespace - this does
however mean that any task outside the fsopen'ing userns that tries to
set the uid/gid of a tmpfs will have to know that the uid/gid will be
resolved by a different namespace than that which the current task is
in. It also subtly changes the behavior of this specific subsystem in
a userland visible way.
- Globally disallow fsconfig calls originating from outside the
fs_context userns - This is a more robust solution that would prevent
any similar bugs, but it may impinge on valid mount use-cases. It's
the best from a security standpoint and if it's determined that it was
not in the original intention to be juggling user/mount namespaces
this way, it's probably the ideal solution.
- Throw EINVAL if the kuid specified cannot be mapped in the mounting
userns (and/or potentially in the fs_context userns) - This is
probably the solution that remains most faithful to all potential
use-cases, but it doesn't reduce the potential for variants in the
future in other parts of the codebase and it also introduces some
slight derivative logic bug risk.
- Don't resolve the uid/gid specified in fsconfig at all, and resolve
it during mount-time when calling an associated fill_super. This is
precedented and used in other parts of the codebase, but specificity
is lost in the final error case since an end-user cannot easily
attribute a mount failure to an unmappable uid.

I've also attached a PoC for this bug that demonstrates that an
unprivileged user can create files/directories with root uid/gid's.
There is no deadline for this issue as we can't see any obvious way to
cross a privilege boundary with this.

Thanks in advance!
--

Seth Jenkins
Information Security Engineer
Google Project Zero
sethjenkins@...gle.com

View attachment "main.c" of type "text/x-csrc" (3713 bytes)