[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <nbmfwdm45z3com6bdo62hac6c3kz4aerjcargsjjeacr4xrajn@4dkrpr45h34w>
Date: Fri, 18 Jul 2025 14:17:56 +0200
From: Nicolas Bouchinet <nicolas.bouchinet@....cyber.gouv.fr>
To: Aleksa Sarai <cyphar@...har.com>
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Vasiliy Kulikov <segooon@...il.com>, Christian Brauner <brauner@...nel.org>,
Olivier Bal-Petre <olivier.bal-petre@....cyber.gouv.fr>, Nicolas Bouchinet <nicolas.bouchinet@....gouv.fr>
Subject: Re: [PATCH] fs: hidepid: Fixes hidepid non dumpable behavior
Hi Aleksa,
Thanks for your reply !
On Fri, Jul 18, 2025 at 07:36:37PM +1000, Aleksa Sarai wrote:
> On 2025-07-18, nicolas.bouchinet@....cyber.gouv.fr <nicolas.bouchinet@....cyber.gouv.fr> wrote:
> > From: Nicolas Bouchinet <nicolas.bouchinet@....gouv.fr>
> >
> > The hidepid mount option documentation defines the following modes:
> >
> > - "noaccess": user may not access any `/proc/<pid>/ directories but
> > their own.
> > - "invisible": all `/proc/<pid>/` will be fully invisible to other users.
> > - "ptraceable": means that procfs should only contain `/proc/<pid>/`
> > directories that the caller can ptrace.
> >
> > We thus expect that with "noaccess" and "invisible" users would be able to
> > see their own processes in `/proc/<pid>/`.
> >
> > The implementation of hidepid however control accesses using the
> > `ptrace_may_access()` function in any cases. Thus, if a process set
> > itself as non-dumpable using the `prctl(PR_SET_DUMPABLE,
> > SUID_DUMP_DISABLE)` it becomes invisible to the user.
>
> In my view, the documentation is wrong here. This behaviour has remained
> effectively unchanged since it was introduced in 0499680a4214 ("procfs:
> add hidepid= and gid= mount options"), and the documentation was written
> by the same author (added to Cc, though they appear to be inactive since
> 2013). hidepid=ptraceable was added many years later, and so the current
> documentation seeming somewhat contradictory is probably more a result
> of a new feature being documented without rewriting the old
> documentation, rather than an incorrect implementation.
I'll change the documentation to match what it really does.
>
> A process marking itself with SUID_DUMP_DISABLE is a *very* strong
> signal that other processes (even processes owned by the same user) must
> have very restricted access to it. Given how many times they have been
> instrumental for protecting against attacks, I am quite hesitant about
> making changes to loosen these restrictions.
>
> For instance, container runtimes need to set SUID_DUMP_DISABLE to avoid
> all sorts of breakout attacks (CVE-2016-9962 and CVE-2019-5736 being the
> most famous examples, but there are plenty of others). If a container
> has been configured with a restrictive hidepid, I would expect the
> kernel to block most attempts to interact with such a process to
> non-privileged users. But this patch would loosen said restrictions.
>
> Now, many of the bits in /proc/self/* are additionally gated behind
> ptrace_may_access() checks, so this kind of change might be less
> catastrophic than at first glance, but the original concerns that
> motivated hidepid= were about /proc/self/cmdline and the uid/euid of
> processes being discoverable, and AFAICS this patch still undoes those
> protections for the cases we care about with SUID_DUMP_DISABLE.
>
> What motivated you to want to change this behaviour?
>
Well, the change is motivated by two things, the first one is the fact
that the only difference between ("noaccess", "invisible") and
"ptraceable" is the verification of the "gid" hidepid parameter. Thus,
in anyway it means that only ptraceable process can be seen.
The second motivation is that the "ptraceable" mode didn't worked with
the yama LSM, which doesn't care about `PTRACE_MODE_READ_FSCREDS` trace
mode. Thus, using hidepid "ptraceable" mode with yama "restricted",
"admin-only" or "no attach" modes doesn't do much.
I have submited a fix to yama [1] in order to make it take into account
`PTRACE_MODE_READ_FSCREDS` traces. With this yama patch, any hidepid
modes would have been affected by yama decision even though the hidepid
documentation claim that processes belonging to users are visible.
The combination of the two patches thus makes the "ptraceable" hidepid
mode work with yama without locking "noaccess" and "invisible" modes.
With hidepid "ptraceable" mode, `SUID_DUMP_DISABLE` process would be
invisible to the user.
[1]: https://lore.kernel.org/all/cf43bc15-e42d-4fde-a2b7-4fe832e177a8@oss.cyber.gouv.fr/
> > This patch fixes the `has_pid_permissions()` function in order to make
> > its behavior to match the documentation.
> >
> > Note that since `ptrace_may_access()` is not called anymore with
> > "noaccess" and "invisible", the `security_ptrace_access_check()` will no
> > longer be called either.
> >
> > Signed-off-by: Nicolas Bouchinet <nicolas.bouchinet@....gouv.fr>
> > ---
> > fs/proc/base.c | 27 ++++++++++++++++++++++++---
> > 1 file changed, 24 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index c667702dc69b8ca2531e88e12ed7a18533f294dd..fb128cb5f95fe65016fce96c75aee18c762a30f2 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -746,9 +746,12 @@ static bool has_pid_permissions(struct proc_fs_info *fs_info,
> > struct task_struct *task,
> > enum proc_hidepid hide_pid_min)
> > {
> > + const struct cred *cred = current_cred(), *tcred;
> > + kuid_t caller_uid;
> > + kgid_t caller_gid;
> > /*
> > - * If 'hidpid' mount option is set force a ptrace check,
> > - * we indicate that we are using a filesystem syscall
> > + * If 'hidepid=ptraceable' mount option is set, force a ptrace check.
> > + * We indicate that we are using a filesystem syscall
> > * by passing PTRACE_MODE_READ_FSCREDS
> > */
> > if (fs_info->hide_pid == HIDEPID_NOT_PTRACEABLE)
> > @@ -758,7 +761,25 @@ static bool has_pid_permissions(struct proc_fs_info *fs_info,
> > return true;
> > if (in_group_p(fs_info->pid_gid))
> > return true;
> > - return ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
> > +
> > + task_lock(task);
> > + rcu_read_lock();
> > + caller_uid = cred->fsuid;
> > + caller_gid = cred->fsgid;
> > + tcred = __task_cred(task);
> > + if (uid_eq(caller_uid, tcred->euid) &&
> > + uid_eq(caller_uid, tcred->suid) &&
> > + uid_eq(caller_uid, tcred->uid) &&
> > + gid_eq(caller_gid, tcred->egid) &&
> > + gid_eq(caller_gid, tcred->sgid) &&
> > + gid_eq(caller_gid, tcred->gid)) {
> > + rcu_read_unlock();
> > + task_unlock(task);
> > + return true;
> > + }
> > + rcu_read_unlock();
> > + task_unlock(task);
> > + return false;
>
> At the very least, this check needs to be gated based on
> ns_capable(get_task_mm(task)->user_ns, CAP_SYS_PTRACE), to avoid
> containers from being able to introspect SUID_DUMP_DISABLE processes
> (such as container runtimes) in the process of joining a user namespaced
> container.
>
IIUC, you want to hide non-dumpable processes joining other user
namespaces to avoid the data exposition of the non-dumpable process in
`/proc/<pid>/*` ?
Like whats is done in `__ptrace_may_access()` :
```C
mm = task->mm;
if (mm &&
((get_dumpable(mm) != SUID_DUMP_USER) &&
!ptrace_has_cap(mm->user_ns, mode)))
```
> > }
> >
> >
> >
> > ---
> > base-commit: 884a80cc9208ce75831b2376f2b0464018d7f2c4
> > change-id: 20250718-hidepid_fix-d0743d0540e7
> >
> > Best regards,
> > --
> > Nicolas Bouchinet <nicolas.bouchinet@....gouv.fr>
> >
> >
>
> --
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> https://www.cyphar.com/
Powered by blists - more mailing lists