Message-ID: <87o6vw1qc4.fsf@email.froward.int.ebiederm.org>
Date: Tue, 13 May 2025 10:29:47 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Kees Cook <keescook@...omium.org>, Jann Horn <jannh@...gle.com>,
Christian Brauner <brauner@...nel.org>, Jorge Merlino
<jorge.merlino@...onical.com>, Alexander Viro <viro@...iv.linux.org.uk>,
Thomas Gleixner <tglx@...utronix.de>, Andy Lutomirski <luto@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Andrew Morton
<akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, John Johansen
<john.johansen@...onical.com>, Paul Moore <paul@...l-moore.com>, James
Morris <jmorris@...ei.org>, "Serge E. Hallyn" <serge@...lyn.com>,
Stephen Smalley <stephen.smalley.work@...il.com>, Eric Paris
<eparis@...isplace.org>, Richard Haines
<richard_c_haines@...nternet.com>, Casey Schaufler
<casey@...aufler-ca.com>, Xin Long <lucien.xin@...il.com>, "David S.
Miller" <davem@...emloft.net>, Todd Kjos <tkjos@...gle.com>, Ondrej
Mosnacek <omosnace@...hat.com>, Prashanth Prahlad <pprahlad@...hat.com>,
Micah Morton <mortonm@...omium.org>, Fenghua Yu <fenghua.yu@...el.com>,
Andrei Vagin <avagin@...il.com>, linux-kernel@...r.kernel.org,
apparmor@...ts.ubuntu.com, linux-security-module@...r.kernel.org,
selinux@...r.kernel.org, linux-hardening@...r.kernel.org,
oleg@...hat.com
Subject: Re: [PATCH 1/2] fs/exec: Explicitly unshare fs_struct on exec

Mateusz Guzik <mjguzik@...il.com> writes:

> On Thu, Oct 06, 2022 at 08:25:01AM -0700, Kees Cook wrote:
>> On October 6, 2022 7:13:37 AM PDT, Jann Horn <jannh@...gle.com> wrote:
>> >On Thu, Oct 6, 2022 at 11:05 AM Christian Brauner <brauner@...nel.org> wrote:
>> >> On Thu, Oct 06, 2022 at 01:27:34AM -0700, Kees Cook wrote:
>> >> > The check_unsafe_exec() counting of n_fs would not add up under a heavily
>> >> > threaded process trying to perform a suid exec, causing the suid portion
>> >> > to fail. This counting error appears to be unneeded, but to catch any
>> >> > possible conditions, explicitly unshare fs_struct on exec, if it ends up
>> >>
>> >> Isn't this a potential uapi break? Afaict, before this change a call to
>> >> clone{3}(CLONE_FS) followed by an exec in the child would have the
>> >> parent and child share fs information. So if the child e.g., changes the
>> >> working directory post exec it would also affect the parent. But after
>> >> this change here this would no longer be true. So a child changing a
>> >> working directory would not affect the parent anymore. IOW, an exec is
>> >> accompanied by an unshare(CLONE_FS). Might still be worth trying ofc but
>> >> it seems like a non-trivial uapi change but there might be few users
>> >> that do clone{3}(CLONE_FS) followed by an exec.
>> >
>> >I believe the following code in Chromium explicitly relies on this
>> >behavior, but I'm not sure whether this code is in active use anymore:
>> >
>> >https://source.chromium.org/chromium/chromium/src/+/main:sandbox/linux/suid/sandbox.c;l=101?q=CLONE_FS&sq=&ss=chromium
>>
>> Oh yes. I think I had tried to forget this existed. Ugh. Okay, so back to the drawing board, I guess. The counting will need to be fixed...
>>
>> It's possible we can move the counting after dethread -- it seems the early count was just to avoid setting flags after the point of no return, but it's not an error condition...
>>
>
> I landed here from git blame.
>
> I was looking at sanitizing shared fs vs suid handling, but the entire
> ordeal is so convoluted I'm confident the best way forward is to whack
> the problem to begin with.
>
> Per the above link, the notion of a shared fs struct across different
> processes is depended on so merely unsharing is a no-go.
>
> However, the shared state is only a problem for suid/sgid.
>
> Here is my proposal: *deny* exec of suid/sgid binaries if fs_struct is
> shared. This will have to be checked for after the execing proc becomes
> single-threaded ofc.
>
> While technically speaking this does introduce a change in behavior,
> there is precedent for doing it and seeing if anyone yells.
>
> With this in place there is no point maintaining ->in_exec or checking
> the flag.
>
> There is the known example of depending on shared fs_struct across exec.
> Hopefully there is no example of depending on execing a suid/sgid binary
> in such a setting -- it would be quite a weird setup given that for
> security reasons the perms must not be changed.
>
> The upshot of this method is that any breakage will be immediately
> visible in the form of a failed exec.
>
> Another route would be to do the mandatory unshare but only for
> suid/sgid, except that would have a hidden failure (if you will).
>
> Comments?

What is the problem you are trying to fix?

A uapi change to not allow sharing a fs_struct for processes that change
their cred on exec seems possible.

I said changing cred instead of suid/sgid because there are capabilities
and LSM labels that we probably want this to apply to as well.

I think such a limitation can be justified on the grounds that a shared
fs_struct is likely to allow confusing suid executables.
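
To make the sharing concrete, here is a minimal sketch, assuming Linux
and glibc's clone() wrapper. The function name
child_chdir_visible_to_parent() is made up for illustration, and the
exec step is left out for brevity. It only shows that with CLONE_FS a
chdir() in one process moves the cwd of the other, which is the kind of
outside influence a program changing cred on exec would be exposed to.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child entry point: change the working directory, then exit. */
static int child_fn(void *arg)
{
	(void)arg;
	return chdir("/") == 0 ? 0 : 1;
}

/*
 * Hypothetical helper, not kernel code.
 * Returns 1 if a chdir() done by the child changed the parent's cwd,
 * 0 if it did not, and -1 on error.
 */
int child_chdir_visible_to_parent(int share_fs)
{
	char before[PATH_MAX], after[PATH_MAX];
	const size_t stack_size = 256 * 1024;
	int flags = SIGCHLD | (share_fs ? CLONE_FS : 0);
	char *stack;
	int status;
	pid_t pid;

	/* Start from a directory other than "/" so a change is observable. */
	if (chdir("/tmp") != 0 || !getcwd(before, sizeof(before)))
		return -1;

	stack = malloc(stack_size);
	if (!stack)
		return -1;

	/* No CLONE_VM: the child runs on its own copy of this stack. */
	pid = clone(child_fn, stack + stack_size, flags, NULL);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	if (waitpid(pid, &status, 0) != pid) {
		free(stack);
		return -1;
	}
	free(stack);

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	if (!getcwd(after, sizeof(after)))
		return -1;

	/* With a shared fs_struct the parent now sits in "/" as well. */
	return strcmp(before, after) != 0;
}
```

Calling it with share_fs set to 1 should report that the parent's cwd
moved, and with 0 that it did not. The same sharing covers the root
directory and the umask, since those live in fs_struct too.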

Earlier in the thread there was talk about the refcount for fs_struct.
I don't see that problem at the moment, and I don't see how dealing with
suid+sgid executables will have any bearing on how the refcount works.

Eric