Date:	Wed, 16 Sep 2015 10:37:33 +0300
From:	Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:	Serge Hallyn <serge.hallyn@...ntu.com>,
	Stéphane Graber <stgraber@...ntu.com>
Cc:	linux-api@...r.kernel.org, containers@...ts.linux-foundation.org,
	Oleg Nesterov <oleg@...hat.com>, linux-kernel@...r.kernel.org,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH RFC] pidns: introduce syscall getvpid

On 15.09.2015 20:41, Serge Hallyn wrote:
> Quoting Stéphane Graber (stgraber@...ntu.com):
>> On Tue, Sep 15, 2015 at 06:01:38PM +0300, Konstantin Khlebnikov wrote:
>>> On 15.09.2015 17:27, Eric W. Biederman wrote:
>>>> Konstantin Khlebnikov <khlebnikov@...dex-team.ru> writes:
>>>>
>>>>> pid_t getvpid(pid_t pid, pid_t source, pid_t target);
>>>>>
>>>>> This syscall converts pid from one pid-ns into pid in another pid-ns:
>>>>> it takes @pid in namespace of @source task (zero for current) and
>>>>> returns related pid in namespace of @target task (zero for current too).
>>>>> If pid is unreachable from target pid-ns then it returns zero.
>>>>
>>>> This interface as presented is inherently racy.  It would be better
>>>> if source and target were file descriptors referring to the namespaces
>>>> you wish to translate between.
>>>
>>> Yep, it's racy, as is any operation on non-child pids.
>>> Even with file descriptors for source/target the result would still be racy.
>>>
>>>>
>>>>> Such conversion is required for interaction between processes from
>>>>> different pid-namespaces. For example when system service talks with
>>>>> client from isolated container via socket about task in container:
>>>>
>>>> Sockets are already supported.  At least the metadata of sockets is.
>>>>
>>>> Maybe we need this but I am not convinced of its utility.
>>>>
>>>> What are you trying to do that motivates this?
>>>
>>> I'm working on a hierarchical container management system which
>>> allows creating and controlling nested sub-containers from containers
>>> ( https://github.com/yandex/porto ). The main server runs on the host and
>>> has to interact with all levels of nested namespaces. This syscall
>>> makes some operations much easier: the server only has to remember the pid
>>> in the host pid namespace and convert it into the right vpid on demand.
>>
>> Note that as Eric said earlier, sending a PID inside a ucred through a
>> unix socket will have the pid translated.
>>
>> So while your solution certainly should be faster, you can already achieve
>> what you want today by doing:
>>
>> == Translate PID in container to PID in host
>>   - open a socket
>>   - setns to container's pidns
>>   - send ucred from that container containing the requested container PID
>>   - host sees the host PID
>>
>> == Translate PID on host to PID in container
>>   - open a socket
>>   - setns to container's pidns
>>   - send ucred from the host containing the requested host PID
>>     (send will fail if the host PID isn't part of that container)
>>   - container sees the container PID
>
> In addition, since commit e4bc332451 : /proc/PID/status: show all sets of pid according to ns
> we now also have 'NSpid' etc in /proc/$$/status.
>
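
For reference, the receiving side of the ucred approach quoted above might
look roughly like the sketch below. It is only an illustration: the names
recv_sender_pid and demo are made up here, both ends live in the same pid-ns
(no setns step), and the kernel attaches the sender's own credentials because
SO_PASSCRED is set. Sending a pid other than your own in the ucred would
additionally need CAP_SYS_ADMIN on the sending side.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid;
UCRED = struct.Struct("iII")


def recv_sender_pid(sock):
    """Receive one message; return the sender's pid as seen in our pid-ns."""
    _, ancdata, _, _ = sock.recvmsg(16, socket.CMSG_SPACE(UCRED.size))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_CREDENTIALS:
            pid, _uid, _gid = UCRED.unpack(data[:UCRED.size])
            return pid
    return None  # no credentials were attached


def demo():
    """Send one byte over a socketpair and read back the sender's pid."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Ask the kernel to attach sender credentials to received messages.
        b.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
        a.send(b"x")
        return recv_sender_pid(b)
    finally:
        a.close()
        b.close()
```

If the sender sits in another pid-ns, the pid returned here is the translated
one, which is the whole point of the trick.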

As far as I can see, NSpid works well only for converting a host pid into a virtual one.
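
A rough sketch of that direction, assuming a kernel that has the NSpid field
in /proc/PID/status (the helper names nspid_chain and host_to_virtual are
made up for illustration):

```python
def nspid_chain(status_text):
    """Return the pids from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # kernels without the NSpid field


def host_to_virtual(host_pid, proc_root="/proc"):
    """Return the pid of host_pid in its own (innermost) pid-ns, or None."""
    with open(f"{proc_root}/{host_pid}/status") as f:
        chain = nspid_chain(f.read())
    return chain[-1] if chain else None
```

The last entry is the pid in the task's own pid-ns; for a container at an
intermediate level one would pick the entry at that nesting depth instead.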

The backward conversion is troublesome: we have to scan all pids in the host
procfs and somehow filter out the tasks that belong to the container and its
sub-pid-namespaces. Or am I missing something trivial?
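
To show what I mean by scanning, here is a minimal sketch. The function name
virtual_to_host is made up, and it assumes we already know the
/proc/<pid>/ns/pid link text of some task in the container. It only matches
tasks living directly in that pid-ns, so tasks in nested sub-pid-ns are
exactly the part it does not solve, and it is as racy as any other procfs walk.

```python
import os


def virtual_to_host(vpid, target_ns, proc_root="/proc"):
    """Return host pids whose innermost NSpid is vpid inside target_ns.

    target_ns is the link text of /proc/<pid>/ns/pid for a task known to be
    in the container, for example "pid:[4026532201]".
    """
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a task directory
        base = os.path.join(proc_root, entry)
        try:
            if os.readlink(os.path.join(base, "ns", "pid")) != target_ns:
                continue  # task lives in some other pid-ns
            with open(os.path.join(base, "status")) as f:
                text = f.read()
        except OSError:
            continue  # task exited meanwhile, or access was denied
        for line in text.splitlines():
            if line.startswith("NSpid:"):
                fields = line.split()[1:]
                if fields and int(fields[-1]) == vpid:
                    matches.append(int(entry))
                break
    return matches
```

Every lookup costs a full pass over the host procfs, which is what I would
like to avoid.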

-- 
Konstantin