Message-ID: <m1aaur26b8.fsf@fess.ebiederm.org>
Date:	Mon, 01 Mar 2010 11:24:59 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Daniel Lezcano <daniel.lezcano@...e.fr>
Cc:	Pavel Emelyanov <xemul@...allels.com>, hadi@...erus.ca,
	Patrick McHardy <kaber@...sh.net>,
	Linux Netdev List <netdev@...r.kernel.org>,
	containers@...ts.linux-foundation.org,
	Netfilter Development Mailinglist 
	<netfilter-devel@...r.kernel.org>,
	Ben Greear <greearb@...delatech.com>,
	Serge Hallyn <serue@...ibm.com>,
	Matt Helsley <matthltc@...ibm.com>
Subject: Re: [RFC][PATCH] ns: Syscalls for better namespace sharing control.

Daniel Lezcano <daniel.lezcano@...e.fr> writes:


>> Replacing struct pid is guaranteed to do all kinds of nasty things with
>> signal handling and the like, de_thread is nasty enough and you are talking
>> something worse.  So if we can change pid namespaces without changing
>> the pid I am for it.
>
> I agree with all the points you and Pavel talked about, but I don't feel
> comfortable having the current process switch the pid namespace because of
> the process tree hierarchy (what will be the parent of the process when you
> enter the pid namespace, for example). How does this differ from sys_bindns
> or sys_hijack, proposed a couple of years ago?

I was not aiming at the general enter case.  There is a very specific case
in networking where we only need a network namespace, not full-blown
containers, so I was seeing what could be done to handle the easy case.

The big idea is solving the namespace naming issues with bind mounts and file
descriptors.  All of the rest is window dressing for that idea.
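
A minimal sketch of the naming half of that idea, assuming the
/proc/<pid>/ns/net files found in current mainline kernels (the interface
in this RFC may differ).  The helper name_current_netns() and the path it
is given are hypothetical; the call needs CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Give the caller's network namespace a filesystem name by bind mounting
 * /proc/self/ns/net onto `name` (a hypothetical path picked by the caller,
 * which must not exist yet).  Returns 0 on success, -1 with errno set.
 */
int name_current_netns(const char *name)
{
    /* A bind mount needs an existing target, so create an empty file. */
    int fd = open(name, O_RDONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0444);
    if (fd < 0)
        return -1;
    close(fd);

    /* The mount keeps the namespace alive even after its processes exit. */
    if (mount("/proc/self/ns/net", name, NULL, MS_BIND, NULL) != 0) {
        int saved_errno = errno;
        unlink(name);           /* remove the placeholder we created */
        errno = saved_errno;
        return -1;
    }
    return 0;
}
```

Any process that can open the resulting path then holds a file descriptor
that refers to the namespace, independent of pids.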

setns looks like the easy way, but what is really needed for the network
namespace is a way to open sockets that are in a specified network namespace.
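
One way to get that effect on top of setns, sketched here under the
assumption of the setns(2) signature in current mainline kernels and glibc
(setns(fd, nstype), which may not match the prototype in this RFC).  The
helper socket_in_netns() is hypothetical.  It switches the calling thread
into the target namespace, creates the socket, and switches back.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a socket that lives in the network namespace named by ns_path
 * (for example a bind-mounted namespace file).  Returns the socket fd,
 * or -1 with errno set.  Only the calling thread changes namespace.
 */
int socket_in_netns(const char *ns_path, int domain, int type, int protocol)
{
    int sock = -1;
    int saved_errno = 0;

    /* Remember where we came from so we can return afterwards. */
    int self_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
    if (self_fd < 0)
        return -1;

    int target_fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (target_fd < 0) {
        saved_errno = errno;
        goto out_self;
    }

    if (setns(target_fd, CLONE_NEWNET) != 0) {
        saved_errno = errno;
        goto out_target;
    }

    /* The socket stays bound to the namespace it was created in. */
    sock = socket(domain, type, protocol);
    if (sock < 0)
        saved_errno = errno;

    /* Switch back; if that fails, do not hand out the socket. */
    if (setns(self_fd, CLONE_NEWNET) != 0) {
        saved_errno = errno;
        if (sock >= 0)
            close(sock);
        sock = -1;
    }

out_target:
    close(target_fd);
out_self:
    close(self_fd);
    if (sock < 0)
        errno = saved_errno;
    return sock;
}
```

A dedicated "socket in namespace" call would avoid the round trip and the
window in which the thread sits in the other namespace.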

> I made a suggestion some weeks ago about a new syscall 'cloneat' where the
> child process becomes the child of the targeted process specified in the
> syscall. Maybe it would be interesting to replace 'setns' by, or add, a
> 'cloneat' syscall with the file descriptor passed as a parameter. The
> copy_process function would not use the nsproxy of the caller but the one
> provided in the fd argument.
>
> The newly created process becomes the child of the process from which we
> retrieve the namespace with nsfd, and that process has to 'waitpid' it (the
> caller of 'cloneat' cannot wait for it). It's a bit similar to the
> CLONE_PARENT flag, except the creation order is inverted (the father creates
> for the child).
>
> So when entering the container, we specify the pid 1 of the container which is
> usually a child reaper.
>
> Does it make sense?

Essentially.  I am not hugely interested in solving the general case
if it takes us off into tangents about pid namespace semantics.

I have just realized that while the original use case for having Unix
domain sockets able to work across network namespaces was a little
weak, there are much better arguments.  Operationally it is a game
changer.  In the case where you don't need to support migration, it
allows direct access to your X server and greatly simplifies the
design of a server designed to start processes in your container.

Eric