Message-ID: <CAM5jBj4ik-deECPc-G+b6JHHszvQEFA5N9uYCUS-fex5xrT1gA@mail.gmail.com>
Date: Mon, 24 Apr 2017 22:03:18 +0300
From: Cyrill Gorcunov <gorcunov@...nvz.org>
To: Kirill Tkhai <ktkhai@...tuozzo.com>
Cc: "Serge E. Hallyn" <serge@...lyn.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>, agruenba@...hat.com,
Linux API <linux-api@...r.kernel.org>,
Oleg Nesterov <oleg@...hat.com>,
Linux kernel mailing list <linux-kernel@...r.kernel.org>,
paul@...l-moore.com, Al Viro <viro@...iv.linux.org.uk>,
Andrew Vagin <avagin@...nvz.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
Michael Kerrisk <mtk.manpages@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...capital.net>,
Ingo Molnar <mingo@...nel.org>,
Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH 2/2] pid_ns: Introduce ioctl to set vector of
ns_last_pid's on ns hierarhy
On Mon, Apr 17, 2017 at 8:36 PM, Kirill Tkhai <ktkhai@...tuozzo.com> wrote:
> While implementing nested pid namespace support in CRIU
> (the checkpoint-restore in userspace tool) we ran into
> a situation where it is impossible to create a task with
> a specific NSpid efficiently. After commit 49f4d8b93ccf
> "pidns: Capture the user namespace and filter ns_last_pid"
> it is impossible to set ns_last_pid on any pid namespace
> except the task's active pid_ns (before the commit it was possible
> to write to pid_ns_for_children). Thus, if a restored task
> in a container has more than one pid_ns level, the restorer
> code must have a helper task for every pid namespace
> in the task's pid_ns hierarchy.
>
> This is a big problem, because communicating with
> a helper for every pid_ns in the hierarchy is not cheap:
> it implies many helper wakeups to create a single task
> (regardless of how you communicate with the helpers).
> This patch tries to solve the problem.
>
> It introduces a new pid_ns ns_ioctl(PIDNS_REQ_SET_LAST_PID_VEC),
> which allows writing a vector of last pids across the pid_ns hierarchy.
> The vector is passed as a ":"-delimited string of pids,
> written in reverse order. The first number corresponds to
> the opened namespace's ns_last_pid, the second to its parent's, etc.
> So, if you have a pid namespace hierarchy like:
>
> pid_ns1 (grand father)
> |
> v
> pid_ns2 (father)
> |
> v
> pid_ns3 (child)
>
> and the ns of a task in pid_ns3 is open, then the corresponding
> vector will be "last_ns_pid3:last_ns_pid2:last_ns_pid1". The
> vector may be shorter and contain fewer levels, for example
> "last_ns_pid3:last_ns_pid2" or even "last_ns_pid3", depending
> on which levels you want to populate.
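
For illustration only, the vector format described above could be produced
in userspace along these lines. This is a minimal sketch, not code from the
patch: build_last_pid_vec() is a hypothetical helper name, and how the
resulting string is placed into struct pidns_ioc_req is deliberately left
out, since that layout is defined by the patch itself.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): format ns_last_pid values
 * as a ":"-delimited string. pids[0] is for the opened pid namespace,
 * pids[1] for its parent, and so on up the hierarchy.
 *
 * Returns the length of the string written to buf, or -1 if the input
 * is empty or buf is too small.
 */
static int build_last_pid_vec(const int *pids, size_t n,
			      char *buf, size_t buflen)
{
	size_t off = 0;

	if (n == 0 || buflen == 0)
		return -1;

	for (size_t i = 0; i < n; i++) {
		/* Prefix every element except the first with ':' */
		int ret = snprintf(buf + off, buflen - off, "%s%d",
				   i ? ":" : "", pids[i]);

		/* snprintf reports truncation by returning >= remaining size */
		if (ret < 0 || (size_t)ret >= buflen - off)
			return -1;
		off += (size_t)ret;
	}

	return (int)off;
}
```

Passing fewer elements gives the shorter forms mentioned above: one element
addresses only the opened namespace, two elements address it and its parent.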
>
> To write to a pid_ns's ns_last_pid we check that the writer task
> has CAP_SYS_ADMIN permissions in this pid_ns's user_ns.
>
> One note about struct pidns_ioc_req: it is made extensible and
> may be expanded in the future. The fields present at the moment
> will always exist; future fields and their sizes may be determined
> from pidns_ioc_req::req by future code.
>
> Signed-off-by: Kirill Tkhai <ktkhai@...tuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@...nvz.org>