linux-kernel - Re: [PATCH v3 1/2] exec: add PR_HIDE_SELF

Message-ID: <20230130100602.elyvs6oorfzukjwh@wittgenstein>
Date:   Mon, 30 Jan 2023 11:06:02 +0100
From:   Christian Brauner <brauner@...nel.org>
To:     Colin Walters <walters@...bum.org>
Cc:     Giuseppe Scrivano <gscrivan@...hat.com>,
        Aleksa Sarai <cyphar@...har.com>, linux-kernel@...r.kernel.org,
        Kees Cook <keescook@...omium.org>, bristot@...hat.com,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Alexander Larsson <alexl@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>, bmasney@...hat.com
Subject: Re: [PATCH v3 1/2] exec: add PR_HIDE_SELF_EXE prctl

On Mon, Jan 30, 2023 at 10:53:31AM +0100, Christian Brauner wrote:
> On Sun, Jan 29, 2023 at 01:12:45PM -0500, Colin Walters wrote:
> > 
> > 
> > On Sun, Jan 29, 2023, at 11:58 AM, Christian Brauner wrote:
> > > On Sun, Jan 29, 2023 at 08:59:32AM -0500, Colin Walters wrote:
> > >> 
> > >> 
> > >> On Wed, Jan 25, 2023, at 11:30 AM, Giuseppe Scrivano wrote:
> > >> > 
> > >> > After reading some comments on the LWN.net article, I wonder if
> > >> > PR_HIDE_SELF_EXE should apply to CAP_SYS_ADMIN in the initial user
> > >> > namespace or if in this case root should keep the privilege to inspect
> > >> > the binary of a process.  If a container runs with that many privileges
> > >> > then it has already other ways to damage the host anyway.
> > >> 
> > >> Right, that's what I was trying to express with the "make it work the same as map_files".  Hiding the entry entirely even for initial-namespace-root (real root) seems like it's going to potentially confuse profiling/tracing/debugging tools for no good reason.
> > >
> > > If this can be circumvented via CAP_SYS_ADMIN 
> > 
> > To be clear, I'm proposing CAP_SYS_ADMIN in the current user namespace at the time of the prctl().  (Or if keeping around a reference just for this is too problematic, perhaps hardcoding to the init ns)
> 
> Oh no, I fully understand. The point was that the userspace fix protects
> even against attackers with CAP_SYS_ADMIN in init_user_ns. And that was
> important back then and is still relevant today for some workloads.
> 
> For unprivileged containers where host and container are separated by a
> meaningful user namespace boundary this whole mitigation is irrelevant
> as the binary can't be overwritten.
> 
> > 
> > A process with CAP_SYS_ADMIN in a child namespace would still not be able to read the binary.
> > 
> > > then this mitigation
> > > becomes immediately way less interesting because the userspace
> > > mitigation we came up with protects against CAP_SYS_ADMIN as well
> > > without any regression risk. 
> > 
> > The userspace mitigation here being "clone self to memfd"?  But that's a sufficiently ugly workaround that it's created new problems; see https://lwn.net/Articles/918106/
> 
> But this is a problem with the memfd api not with the fix. Following the
> thread the ability to create executable memfds will stay around. As it
> should be given how long this has been supported. And they have backward
> compatibility in mind which is great.

Following up on yesterday's promise to check with the criu org I'm
part of: this is going to break criu, unfortunately, as it dumps (and
restores) /proc/self/exe. Even with an escape hatch we'd still risk
breaking it. The memfd solution, again, doesn't cause those
issues.

Don't get me wrong: I was pretty supportive of this fix, especially
because it looked rather simple, but it is turning out to be less
simple than we thought. I don't think that this is worth it given the
functioning fixes we already have.

The good thing is that, even if it will take a bit longer, Aleksa's
patchset will provide a more general solution by making it possible for
runc/crun/lxc to open the target binary with a restricted upgrade mask,
making it impossible to reopen the binary read-write. This won't
break criu, will fix this issue, and is generally useful.
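
For reference, the "clone self to memfd" userspace mitigation discussed
above can be sketched roughly as follows. This is only an illustration
of the general idea, not the code any runtime actually ships; the
function name clone_self_to_memfd() and the memfd name "self-exe-clone"
are made up for this example, and it assumes a glibc recent enough to
expose memfd_create() and the F_SEAL_* constants.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the running binary into a sealed memfd.
 * Returns the memfd on success, -1 on failure.
 */
int clone_self_to_memfd(void)
{
	struct stat st;
	off_t remaining;
	int src, fd;

	src = open("/proc/self/exe", O_RDONLY | O_CLOEXEC);
	if (src < 0)
		return -1;

	/* MFD_ALLOW_SEALING is required to add seals later on. */
	fd = memfd_create("self-exe-clone", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0) {
		close(src);
		return -1;
	}

	if (fstat(src, &st) < 0)
		goto err;

	/* Copy the whole binary; sendfile() may transfer less than asked. */
	remaining = st.st_size;
	while (remaining > 0) {
		ssize_t n = sendfile(fd, src, NULL, remaining);
		if (n <= 0)
			goto err;
		remaining -= n;
	}
	close(src);

	/* After sealing, the copy can no longer be written or resized. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_GROW |
				   F_SEAL_WRITE | F_SEAL_SEAL) < 0) {
		close(fd);
		return -1;
	}

	return fd;

err:
	close(src);
	close(fd);
	return -1;
}
```

A caller would then re-execute itself from the returned descriptor,
e.g. with fexecve(fd, argv, environ), so that /proc/self/exe of the new
process refers to the sealed copy rather than to the binary on the host
filesystem.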