Message-ID: <6c907b21aca7b93c3b637ba0e30de4c6acb356f4.camel@themaw.net>
Date: Thu, 21 Feb 2019 18:39:28 +0800
From: Ian Kent <raven@...maw.net>
To: Christian Brauner <christian@...uner.io>
Cc: David Howells <dhowells@...hat.com>, keyrings@...r.kernel.org,
trond.myklebust@...merspace.com, sfrench@...ba.org,
James Bottomley <James.Bottomley@...senPartnership.com>,
linux-cifs@...r.kernel.org, linux-nfs@...r.kernel.org,
containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org,
linux-security-module@...r.kernel.org,
linux-fsdevel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [RFC PATCH 02/27] containers: Implement containers as kernel
objects
On Wed, 2019-02-20 at 14:26 +0100, Christian Brauner wrote:
> On Wed, Feb 20, 2019 at 10:46:24AM +0800, Ian Kent wrote:
> > On Fri, 2019-02-15 at 16:07 +0000, David Howells wrote:
> > > Implement a kernel container object such that it contains the following
> > > things:
> > >
> > > (1) Namespaces.
> > >
> > > (2) A root directory.
> > >
> > > (3) A set of processes, including one designated as the 'init' process.
> >
> > Yeah, I think a name other than init needs to be used for this
> > process.
> >
> > The problem being that there is no requirement for container
> > process 1 to behave in any way like an "init" process is
> > expected to behave and that leads to confusion (at least
> > it certainly did for me).
>
> If you look at the documentation for pid namespaces(7) you can see that
> the pid 1 inside a pid namespace is expected to behave like an init
> process:
> - "The first process created in a new namespace [...] has the PID 1,
> and is the "init" process for the namespace (see init(1))."
> - "[...] child process that is orphaned within the namespace will be
> reparented to this process rather than init(1) [...]"
> - "If the "init" process of a PID namespace terminates, the kernel
> terminates all of the processes in the namespace via a SIGKILL
> signal. This behavior reflects the fact that the "init" process is
> essential for the correct operation of a PID namespace."
> - "Only signals for which the "init" process has established a signal
> handler can be sent to the "init" process by other members of the
> PID namespace."
> - "[...] the reboot(2) system call causes a signal to be sent to the
> namespace "init" process."
>
> This is one of the reasons why all major current container runtimes,
> after years of failing to realize this, now run a stub init process
> that mimics a dumb init. Sure, you can get away with not having an init
> that behaves like an init, but this is inherently broken or at least
> against the way pid namespaces were designed.
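For illustration only, the two duties of a stub init quoted above (installing handlers so that signals from inside the namespace can actually be delivered, and reaping reparented children) might be sketched in user space roughly as below. This is a minimal sketch, not what any particular runtime does; the names state, install_handlers and reap_children are hypothetical, and it runs as an ordinary process rather than inside a new PID namespace.

```python
import os
import signal

# Hypothetical shared flag; set when a termination signal is received.
state = {"shutdown": False}


def _on_term(signum, frame):
    # Only record the request; the main loop decides how to shut down.
    state["shutdown"] = True


def install_handlers():
    # Per pid_namespaces(7), other members of the namespace can only send
    # init those signals for which init has established a handler, so the
    # handlers have to be installed explicitly.
    signal.signal(signal.SIGTERM, _on_term)
    signal.signal(signal.SIGINT, _on_term)


def reap_children():
    """Reap all children that have already exited.

    Returns a list of (pid, code) tuples, where code is the exit status,
    or the negated signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped
```

A stub init along these lines would call install_handlers() once and then loop on reap_children() until state["shutdown"] is set.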
TBH I wasn't sure why the signal I sent didn't arrive; AFAICS
it should have, regardless of which signals the container init
process was accepting. But it could have been due to a
different problem in my kernel code (that's very likely).

In any case it wasn't worth pursuing because, even if I did work
it out, I had already found that the request_key sub-system wasn't
playing well with others when trying to run something within a
container's namespaces, so there was no point in going further ...

Ian