Message-ID: <678f275b-8d78-9b0f-177f-5ff5c9c55657@oracle.com>
Date: Tue, 3 Apr 2018 14:45:28 -0700
From: Nagarathnam Muthusamy <nagarathnam.muthusamy@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
ebiederm@...ssion.com, khlebnikov@...dex-team.ru,
serge.hallyn@...ntu.com, oleg@...hat.com, luto@...capital.net,
jannh@...gle.com, prakash.sangappa@...cle.com
Subject: Re: [RESEND PATCH V4] pidns: introduce syscall translate_pid
On 04/03/2018 02:38 PM, Andrew Morton wrote:
> On Mon, 2 Apr 2018 15:57:29 -0600 nagarathnam.muthusamy@...cle.com wrote:
>
>> pid_t translate_pid(pid_t pid, int source, int target);
>>
>> This syscall converts pid from source pid-ns into pid in target pid-ns.
>> If pid is unreachable from target pid-ns it returns zero.
>>
>> Pid namespaces are referred to by file descriptors opened on the proc
>> files /proc/[pid]/ns/pid or /proc/[pid]/ns/pid_for_children. A negative
>> argument refers to the current pid namespace, same as /proc/self/ns/pid.
>>
>> The kernel exposes virtual pids in /proc/[pid]/status:NSpid, but backward
>> translation requires scanning all tasks. Pids could also be translated
>> by sending them through a unix socket between namespaces, but this method
>> is slow and insecure because the other side is exposed inside the pid
>> namespace.
>>
>> Examples:
>> translate_pid(pid, ns, -1) - get pid in our pid namespace
>> translate_pid(pid, -1, ns) - get pid in other pid namespace
>> translate_pid(1, ns, -1) - get pid of init task for namespace
>> translate_pid(pid, -1, ns) > 0 - is pid reachable from ns?
>> translate_pid(1, ns1, ns2) > 0 - is ns1 inside ns2?
>> translate_pid(1, ns1, ns2) == 0 - is ns1 outside ns2?
>> translate_pid(1, ns1, ns2) == 1 - is ns1 equal to ns2?
>>
>> Error codes:
>> EBADF - file descriptor is closed
>> EINVAL - file descriptor isn't pid-namespace
>> ESRCH - task not found in @source namespace
> Presumably a manpage is planned?
>
> This changelog doesn't explain what the value is to our users. I
> assume it is a performance optimization because "backward translation
> requires scanning all tasks"? If so, please show us real-world
> examples of the performance benefit from this patch, and please go to
> great lengths to explain to us why this optimisation is needed by our
> users.
One of the use cases in the Oracle database involves multiple levels of
nested pid namespaces, and we require pid translation between the
levels. The particular use case, and why none of the existing methods
was usable, was discussed in the following thread:
https://patchwork.kernel.org/patch/10276785/
In the end, it was agreed that this patch, along with flocks, would solve
the issue.
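For reference, the existing /proc/[pid]/status:NSpid lookup mentioned in
the quoted changelog can be sketched roughly as below. This is only an
illustration and not part of the patch: parse_nspid() is a hypothetical
helper name, and it assumes the caller has already read the status file
into a NUL-terminated buffer. The leftmost NSpid entry is the pid as seen
from the namespace of the procfs mount, followed by the values in
successively nested namespaces.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): parse the "NSpid:" line out
 * of the text of /proc/[pid]/status. Fills `out` with up to `max` pids,
 * outermost visible namespace first, and returns how many were found
 * (0 if there is no NSpid line).
 */
static size_t parse_nspid(const char *status, pid_t *out, size_t max)
{
    const char *line = status;
    size_t n = 0;

    /* Walk line by line until one starts with "NSpid:". */
    while (line && strncmp(line, "NSpid:", 6) != 0) {
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    if (!line)
        return 0;

    line += 6;
    while (n < max) {
        char *end;
        long v;

        /* Skip separators, but never run past the end of this line. */
        while (*line == ' ' || *line == '\t')
            line++;
        if (*line == '\n' || *line == '\0')
            break;

        v = strtol(line, &end, 10);
        if (end == line)
            break;          /* not a number: stop parsing */
        out[n++] = (pid_t)v;
        line = end;
    }
    return n;
}
```

Going forward (outer pid to inner pid) takes one such read. Going
backward means running this over the status file of every task and
comparing the last entry, which is the scan the changelog refers to.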
Thanks,
Nagarathnam.