Message-ID: <20220615085350.theicffhehgbmfep@wittgenstein>
Date: Wed, 15 Jun 2022 10:53:50 +0200
From: Christian Brauner <brauner@...nel.org>
To: Florian Weimer <fweimer@...hat.com>
Cc: Kees Cook <keescook@...omium.org>, Andrei Vagin <avagin@...il.com>,
linux-kernel@...r.kernel.org,
Dmitry Safonov <0x7f454c46@...il.com>, linux-mm@...ck.org,
Eric Biederman <ebiederm@...ssion.com>
Subject: Re: [PATCH 1/2] fs/exec: allow to unshare a time namespace on
vfork+exec
On Wed, Jun 15, 2022 at 10:14:19AM +0200, Florian Weimer wrote:
> * Christian Brauner:
>
> > For pid namespaces one problem would be that it could end up confusing a
> > process about its own pid. This was a more serious problem when the pid
> > cache was still active in glibc; but fwiw systemd still has a pid cache
> > afair.
>
> Right. glibc still has a TID cache, mainly for use with recursive
> mutexes (where we need a 32-bit thread identifier and can't perform a
> system call on every locking operation for performance reasons).
> Assuming that a non-delayed CLONE_NEWPID would also change the TID
> underneath us, we'd have subtly broken recursive mutexes.
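
To make the concern concrete, here is a rough sketch of a TID cache feeding a recursive lock. It is not glibc's implementation: `cached_tid`, `fast_tid()` and `struct toy_recursive_lock` are hypothetical names, and the lock is deliberately single-threaded (no atomics, no blocking). It only shows why the owner check wants a cheap, stable thread identifier.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-thread TID cache; 0 means "not fetched yet". */
static __thread pid_t cached_tid;

/* Ask the kernel for the TID (one system call per invocation). */
pid_t kernel_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Return the cached TID, filling the cache on first use. */
pid_t fast_tid(void)
{
    if (cached_tid == 0)
        cached_tid = kernel_tid();
    return cached_tid;
}

/* Toy recursive lock: owner TID plus nesting depth. Not thread-safe. */
struct toy_recursive_lock {
    pid_t owner;            /* 0 when unlocked */
    unsigned int depth;
};

/* Returns 0 on success, -1 if a different TID owns the lock. */
int toy_lock(struct toy_recursive_lock *l)
{
    pid_t self = fast_tid();    /* no system call on the hot path */

    if (l->owner == self) {     /* re-entry by the current owner */
        l->depth++;
        return 0;
    }
    if (l->owner != 0)
        return -1;              /* a real mutex would block here */
    l->owner = self;
    l->depth = 1;
    return 0;
}

/* Returns 0 on success, -1 if the caller does not own the lock. */
int toy_unlock(struct toy_recursive_lock *l)
{
    if (l->owner != fast_tid() || l->depth == 0)
        return -1;
    if (--l->depth == 0)
        l->owner = 0;
    return 0;
}
```

If the kernel-side TID changed while `cached_tid` kept the old value, `fast_tid()` and `kernel_tid()` would disagree, and anything comparing the stored owner against a freshly obtained TID would misjudge ownership.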
Fwiw, you can't combine CLONE_NEWPID with CLONE_THREAD. This guarantees
that threads can send signals to each other and that all threads within
the same thread-group can be reached via proc. It'd be awkward to have
a thread whose thread-group leader lives in an ancestor pidns.

Even if you made the whole thread-group change pid namespaces
immediately, it would mean allocating a new TGID and new TIDs in the new
pid namespace, unless the existing values happen to still be unallocated
there.
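
A minimal sketch of the flag restriction, assuming Linux with glibc's `clone()` wrapper; `try_thread_in_new_pidns()` and `never_runs()` are hypothetical helper names. On a stock kernel the call is expected to fail with EINVAL, though a seccomp filter or sandbox may reject it earlier with a different error such as EPERM.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Thread body; not expected to run because clone() should fail. */
static int never_runs(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Try to create a thread (CLONE_THREAD) directly in a new pid
 * namespace (CLONE_NEWPID). Returns 0 if clone() unexpectedly
 * succeeded, otherwise the errno it failed with.
 */
int try_thread_in_new_pidns(void)
{
    static char stack[64 * 1024] __attribute__((aligned(16)));
    int flags = CLONE_VM | CLONE_SIGHAND | CLONE_THREAD | CLONE_NEWPID;

    /* Pass the top of the buffer: stacks grow down on common arches. */
    if (clone(never_runs, stack + sizeof(stack), flags, NULL) == -1)
        return errno;
    return 0;
}
```

CLONE_VM and CLONE_SIGHAND are included because CLONE_THREAD requires them; without them the call would fail with EINVAL for an unrelated reason.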
>
> vfork gets away with not updating the TID cache (which is shared with
> the parent process) because the parent process is suspended while the
> new subprocess is still running and has not execve'ed yet.
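
The vfork+exec pattern being described looks roughly like the sketch below. `vfork_exec_exit_status()` is a hypothetical helper, and it assumes `/bin/sh` exists. The child does nothing but exec (or `_exit` on failure), and the parent only continues once that has happened.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * vfork() a child that execs a shell which exits with status 7.
 * Returns the child's exit status, or -1 on error.
 */
int vfork_exec_exit_status(void)
{
    int status;
    pid_t pid = vfork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: borrows the parent's memory, so only exec or _exit. */
        execl("/bin/sh", "sh", "-c", "exit 7", (char *)NULL);
        _exit(127);     /* reached only if exec failed */
    }

    /* Parent: resumes here only after the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```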
>
> Now one could argue that calling unshare automatically means that you
> must not call any glibc functions afterwards (similar to thread-creating
> clone), or at least that you cannot call any functions which are not
> async-signal-safe, but that does not match existing application
> practice. And I think we actually prefer that file servers call chroot
Yeah, that'd be a rather subtle and risky change for pid namespaces.