linux-hardening - Re: [PATCH 1/2] fs/exec: Explicitly unshare fs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <871pss17hq.fsf@email.froward.int.ebiederm.org>
Date: Tue, 13 May 2025 17:16:49 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Jann Horn <jannh@...gle.com>
Cc: Kees Cook <kees@...nel.org>,  Mateusz Guzik <mjguzik@...il.com>,  Kees
 Cook <keescook@...omium.org>,  Christian Brauner <brauner@...nel.org>,
  Jorge Merlino <jorge.merlino@...onical.com>,  Alexander Viro
 <viro@...iv.linux.org.uk>,  Thomas Gleixner <tglx@...utronix.de>,  Andy
 Lutomirski <luto@...nel.org>,  Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>,  Andrew Morton <akpm@...ux-foundation.org>,
  linux-mm@...ck.org,  linux-fsdevel@...r.kernel.org,  John Johansen
 <john.johansen@...onical.com>,  Paul Moore <paul@...l-moore.com>,  James
 Morris <jmorris@...ei.org>,  "Serge E. Hallyn" <serge@...lyn.com>,
  Stephen Smalley <stephen.smalley.work@...il.com>,  Eric Paris
 <eparis@...isplace.org>,  Richard Haines
 <richard_c_haines@...nternet.com>,  Casey Schaufler
 <casey@...aufler-ca.com>,  Xin Long <lucien.xin@...il.com>,  "David S.
 Miller" <davem@...emloft.net>,  Todd Kjos <tkjos@...gle.com>,  Ondrej
 Mosnacek <omosnace@...hat.com>,  Prashanth Prahlad <pprahlad@...hat.com>,
  Micah Morton <mortonm@...omium.org>,  Fenghua Yu <fenghua.yu@...el.com>,
  Andrei Vagin <avagin@...il.com>,  linux-kernel@...r.kernel.org,
  apparmor@...ts.ubuntu.com,  linux-security-module@...r.kernel.org,
  selinux@...r.kernel.org,  linux-hardening@...r.kernel.org,
  oleg@...hat.com
Subject: Re: [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec

Jann Horn <jannh@...gle.com> writes:

> On Tue, May 13, 2025 at 10:57 PM Kees Cook <kees@...nel.org> wrote:
>> On May 13, 2025 6:05:45 AM PDT, Mateusz Guzik <mjguzik@...il.com> wrote:
>> >Here is my proposal: *deny* exec of suid/sgid binaries if fs_struct is
>> >shared. This will have to be checked for after the execing proc becomes
>> >single-threaded ofc.
>>
>> Unfortunately the above Chrome helper is setuid and uses CLONE_FS.
>
> Chrome first launches a setuid helper, and then the setuid helper does
> CLONE_FS. Mateusz's proposal would not impact this usecase.
>
> Mateusz is proposing to block the case where a process first does
> CLONE_FS, and *then* one of the processes sharing the fs_struct does a
> setuid execve(). Linux already downgrades such an execve() to be
> non-setuid, which probably means anyone trying to do this will get
> hard-to-understand problems. Mateusz' proposal would just turn this
> hard-to-debug edgecase, which already doesn't really work, into a
> clean error; I think that is a nice improvement even just from the
> UAPI standpoint.
>
> If this change makes it possible to clean up the kernel code a bit, even better.

What has brought this to everyone's attention just now?  This is
the second mention of this code path I have seen this week.

AKA: security/commoncap.c:cap_bprm_creds_from_file(...)
> ...
> 	/* Don't let someone trace a set[ug]id/setpcap binary with the revised
> 	 * credentials unless they have the appropriate permit.
> 	 *
> 	 * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
> 	 */
> 	is_setid = __is_setuid(new, old) || __is_setgid(new, old);
> 
> 	if ((is_setid || __cap_gained(permitted, new, old)) &&
> 	    ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) ||
> 	     !ptracer_capable(current, new->user_ns))) {
> 		/* downgrade; they get no more than they had, and maybe less */
> 		if (!ns_capable(new->user_ns, CAP_SETUID) ||
> 		    (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) {
> 			new->euid = new->uid;
> 			new->egid = new->gid;
> 		}
> 		new->cap_permitted = cap_intersect(new->cap_permitted,
> 						   old->cap_permitted);
> 	}

The actual downgrade is because a ptrace'd executable also takes
this path.

I have seen it argued rather forcefully that continuing rather than
simply failing seems better in the ptrace case.

In general I think it can be said this policy is "safe".  AKA we don't
let a shared fs struct confuse privileged applications.  So nothing
to panic about.

It looks like most of the lsm's also test bprm->unsafe.

So I imagine someone could very carefully separate the non-ptrace case
from the ptrace case but *shrug*.

Perhaps:

 	if ((is_setid || __cap_gained(permitted, new_old)) &&
 	    ((bprm->unsafe & ~LSM_UNSAFE_PTRACE) ||
 	     !ptracer_capable(current, new->user_ns))) {
+		if (!(bprm->unsafe & LSM_UNSAFE_PTRACE)) {
+			return -EPERM;
+		}
  		/* downgrade; they get no more than they had, and maybe less */
  		if (!ns_capable(new->user_ns, CAP_SETUID) ||
  		    (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) {
  			new->euid = new->uid;
  			new->egid = new->gid;
  		}
  		new->cap_permitted = cap_intersect(new->cap_permitted,
  						   old->cap_permitted);
         }

If that is what you want that doesn't look to scary.  I don't think
it simplifies anything about fs->in_exec.  As fs->in_exec is set when
the processing calling exec is the only process that owns the fs_struct.
With fs->in_exec just being a flag that doesn't allow another thread
to call fork and start sharing the fs_struct during exec.

*Shrug*

I don't see why anyone would care.  It is just a very silly corner case.

Eric