linux-kernel - Re: [PATCH v2] exec: Correct the permission check for unsafe exec

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez0N_1CEKyMHdjnvwsxUkCenmzsLe7dkUL=a6OmU4tPa6Q@mail.gmail.com>
Date: Wed, 21 May 2025 01:53:29 +0200
From: Jann Horn <jannh@...gle.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>, Richard Guy Briggs <rgb@...hat.com>, 
	"Serge E. Hallyn" <serge@...lyn.com>
Cc: Kees Cook <kees@...nel.org>, Max Kellermann <max.kellermann@...os.com>, paul@...l-moore.com, 
	jmorris@...ei.org, Andy Lutomirski <luto@...nel.org>, morgan@...nel.org, 
	Christian Brauner <christian@...uner.io>, linux-security-module@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] exec: Correct the permission check for unsafe exec

On Wed, May 21, 2025 at 12:13 AM Eric W. Biederman
<ebiederm@...ssion.com> wrote:
> Max Kellerman recently experienced a problem[1] when calling exec with
> differing uid and euid's and he triggered the logic that is supposed
> to only handle setuid executables.
>
> When exec isn't changing anything in struct cred it doesn't make sense
> to go into the code that is there to handle the case when the
> credentials change.

Maybe you could summarize the change here, something like:

"To fix it, when executing with RUID != EUID or RGID != EGID on a
non-set[ug]id transition, do not strip privileges such as ambient
capabilities and (if the execution is marked as unsafe) EUID/EGID
anymore; but keep marking such executions as secureexec."

> When looking into the history of the code I discovered that this issue
> was not present in Linux-2.4.0-test12 and was introduced in
> Linux-2.4.0-prerelease when the logic for handling this case was moved
> from prepare_binprm to compute_creds in fs/exec.c.
>
> The bug introdused was to comparing euid in the new credentials with
> uid instead of euid in the old credentials, when testing if setuid
> had changed the euid.
>
> Since triggering the keep ptrace limping along case for setuid
> executables makes no sense when it was not a setuid exec revert back
> to the logic present in Linux-2.4.0-test12.
>
> This removes the confusingly named and subtlety incorrect helpers
> is_setuid and is_setgid, that helped this bug to persist.
>
> The varaiable is_setid is renamed to id_changed (it's Linux-2.4.0-test12)
> as the old name describes what matters rather than it's cause.
>
> The code removed in Linux-2.4.0-prerelease was:
> -       /* Set-uid? */
> -       if (mode & S_ISUID) {
> -               bprm->e_uid = inode->i_uid;
> -               if (bprm->e_uid != current->euid)
> -                       id_change = 1;
> -       }
> -
> -       /* Set-gid? */
> -       /*
> -        * If setgid is set but no group execute bit then this
> -        * is a candidate for mandatory locking, not a setgid
> -        * executable.
> -        */
> -       if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP)) {
> -               bprm->e_gid = inode->i_gid;
> -               if (!in_group_p(bprm->e_gid))
> -                       id_change = 1;
>
> Linux-2.4.0-prerelease added the current logic as:
> +       if (bprm->e_uid != current->uid || bprm->e_gid != current->gid ||
> +           !cap_issubset(new_permitted, current->cap_permitted)) {
> +                current->dumpable = 0;
> +
> +               lock_kernel();
> +               if (must_not_trace_exec(current)
> +                   || atomic_read(&current->fs->count) > 1
> +                   || atomic_read(&current->files->count) > 1
> +                   || atomic_read(&current->sig->count) > 1) {
> +                       if(!capable(CAP_SETUID)) {
> +                               bprm->e_uid = current->uid;
> +                               bprm->e_gid = current->gid;
> +                       }
> +                       if(!capable(CAP_SETPCAP)) {
> +                               new_permitted = cap_intersect(new_permitted,
> +                                                       current->cap_permitted);
> +                       }
> +               }
> +               do_unlock = 1;
> +       }
>
> I have condenced the logic from Linux-2.4.0-test12 to just:
>         id_changed = !uid_eq(new->euid, old->euid) || !in_group_p(new->egid);
>
> This change is userspace visible, but I don't expect anyone to care.
>
> For the bug that is being fixed to trigger bprm->unsafe has to be set.
> The variable bprm->unsafe is set when ptracing an executable, when
> sharing a working directory, or when no_new_privs is set.  Properly
> testing for cases that are safe even in those conditions and doing
> nothing special should not affect anyone.  Especially if they were
> previously ok with their credentials getting munged
>
> To minimize behavioural changes the code continues to set secureexec
> when euid != uid or when egid != gid.
>
> Reported-by: Max Kellermann <max.kellermann@...os.com>
> Fixes: 64444d3d0d7f ("Linux version 2.4.0-prerelease")
> [1] https://lkml.kernel.org/r/20250306082615.174777-1-max.kellermann@ionos.com
> v1: https://lkml.kernel.org/r/878qmxsuy8.fsf@email.froward.int.ebiederm.org
> Reviewed-by: Serge Hallyn <serge@...lyn.com>
> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>

Looks good to me overall, thanks for figuring out the history of this
not-particularly-easy-to-understand code and figuring out the right
fix.

Reviewed-by: Jann Horn <jannh@...gle.com>

> ---
>  security/commoncap.c | 20 ++++++++------------
>  1 file changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 28d4248bf001..6bd4adeb4795 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -856,12 +856,6 @@ static void handle_privileged_root(struct linux_binprm *bprm, bool has_fcap,
>  #define __cap_full(field, cred) \
>         cap_issubset(CAP_FULL_SET, cred->cap_##field)
>
> -static inline bool __is_setuid(struct cred *new, const struct cred *old)
> -{ return !uid_eq(new->euid, old->uid); }
> -
> -static inline bool __is_setgid(struct cred *new, const struct cred *old)
> -{ return !gid_eq(new->egid, old->gid); }
> -
>  /*
>   * 1) Audit candidate if current->cap_effective is set
>   *
> @@ -891,7 +885,7 @@ static inline bool nonroot_raised_pE(struct cred *new, const struct cred *old,
>             (root_privileged() &&
>              __is_suid(root, new) &&
>              !__cap_full(effective, new)) ||
> -           (!__is_setuid(new, old) &&
> +           (uid_eq(new->euid, old->euid) &&
>              ((has_fcap &&
>                __cap_gained(permitted, new, old)) ||
>               __cap_gained(ambient, new, old))))

Just a comment unrelated to this specific change:

Wow, nonroot_raised_pE() is hard to read, but I guess luckily it's
only used for auditing, so it's not that important...

The diff of the whole expression that decides whether to audit the
execution, reindented for clarity, looks like this:

(
  __cap_grew(effective, ambient, new) &&
  !(
    __cap_full(effective, new)
    &&
    (__is_eff(root, new) || __is_real(root, new))
    &&
    root_privileged()
  )
)
||
(
  root_privileged()
  &&
  __is_suid(root, new)
  &&
  !__cap_full(effective, new)
)
||
(
- !__is_setuid(new, old)
+ uid_eq(new->euid, old->euid)
  &&
  (
    (has_fcap && __cap_gained(permitted, new, old))
    ||
    __cap_gained(ambient, new, old)
  )
)

And we do auditing in three scenarios (situation with the patch applied):

1. We have effective caps that are not ambient caps, and we don't have
a full capability set based on having a root-privileged EUID or RUID.
2. We are in a suid-like execution but are not getting a full capability set.
3. [changed part] We are not changing UID through a suid execution,
but either we gained ambient capabilities through the execution (???)
or we gained permitted capabilities while executing a file with fcaps.

I am highly confused by the __cap_gained(ambient, new, old) check
because I have no idea why ambient capabilities would ever increase on
exec. I thought they can only *decrease* on exec? Apparently that was
introduced in commit dbbbe1105ea6a ("capabilities: audit log other
surprising conditions").

Anyway, yeah, your change looks fine, just the preexisting code looks
dodgy to me.

> @@ -917,7 +911,7 @@ int cap_bprm_creds_from_file(struct linux_binprm *bprm, const struct file *file)
>         /* Process setpcap binaries and capabilities for uid 0 */
>         const struct cred *old = current_cred();
>         struct cred *new = bprm->cred;
> -       bool effective = false, has_fcap = false, is_setid;
> +       bool effective = false, has_fcap = false, id_changed;
>         int ret;
>         kuid_t root_uid;
>
> @@ -941,9 +935,9 @@ int cap_bprm_creds_from_file(struct linux_binprm *bprm, const struct file *file)
>          *
>          * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
>          */
> -       is_setid = __is_setuid(new, old) || __is_setgid(new, old);
> +       id_changed = !uid_eq(new->euid, old->euid) || !in_group_p(new->egid);

Hm, so when we change from one EGID to another EGID which was already
in our groups list, we don't treat it as a privileged exec? Which is
okay because, while an unprivileged user would not just be allowed to
change their EGID to a GID from their groups list themselves through
__sys_setregid(), they would be allowed to create a new setgid binary
owned by a group from their groups list and then execute that?

That's fine with me, though it seems a little weird to me. setgid exec
is changing our creds and yet we're not treating it as a "real" setgid
execution because the execution is only granting privileges that
userspace could have gotten anyway.

> -       if ((is_setid || __cap_gained(permitted, new, old)) &&
> +       if ((id_changed || __cap_gained(permitted, new, old)) &&
>             ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) ||
>              !ptracer_capable(current, new->user_ns))) {
>                 /* downgrade; they get no more than they had, and maybe less */
> @@ -960,7 +954,7 @@ int cap_bprm_creds_from_file(struct linux_binprm *bprm, const struct file *file)
>         new->sgid = new->fsgid = new->egid;
>
>         /* File caps or setid cancels ambient. */
> -       if (has_fcap || is_setid)
> +       if (has_fcap || id_changed)
>                 cap_clear(new->cap_ambient);
>
>         /*
> @@ -993,7 +987,9 @@ int cap_bprm_creds_from_file(struct linux_binprm *bprm, const struct file *file)
>                 return -EPERM;
>
>         /* Check for privilege-elevated exec. */
> -       if (is_setid ||
> +       if (id_changed ||
> +           !uid_eq(new->euid, old->uid) ||
> +           !gid_eq(new->egid, old->gid) ||
>             (!__is_real(root_uid, new) &&
>              (effective ||
>               __cap_grew(permitted, ambient, new))))
> --
> 2.41.0
>