Message-ID: <1377454566.8757.53.camel@dabdike>
Date:	Sun, 25 Aug 2013 18:16:14 +0000
From:	James Bottomley <jbottomley@...allels.com>
To:	Kay Sievers <kay@...y.org>
CC:	Gao feng <gaofeng@...fujitsu.com>,
	"systemd-devel@...ts.freedesktop.org" 
	<systemd-devel@...ts.freedesktop.org>,
	"libvir-list@...hat.com" <libvir-list@...hat.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Linux Containers <containers@...ts.linux-foundation.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	"lxc-devel@...ts.sourceforge.net" <lxc-devel@...ts.sourceforge.net>,
	"davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [systemd-devel] [PATCH] netns: unix: only allow to find out
 unix socket in same net namespace

On Sun, 2013-08-25 at 19:37 +0200, Kay Sievers wrote:
> On Sun, Aug 25, 2013 at 7:16 PM, James Bottomley
> <jbottomley@...allels.com> wrote:
> > On Wed, 2013-08-21 at 11:51 +0200, Kay Sievers wrote:
> >> On Wed, Aug 21, 2013 at 9:22 AM, Gao feng <gaofeng@...fujitsu.com> wrote:
> >> > On 08/21/2013 03:06 PM, Eric W. Biederman wrote:
> >>
> >> >> I suspect libvirt should simply not share /run or any other normally
> >> >> writable directory with the host.  Sharing /run /var/run or even /tmp
> >> >> seems extremely dubious if you want some kind of containment, and
> >> >> without strange things spilling through.
> >>
> >> Right, /run or /var cannot be shared. It's not only about sockets,
> >> many other things will also go really wrong that way.
> >
> > This is very narrow thinking about what a container might be and will
> > cause trouble as people start to create novel uses for containers in the
> > cloud if you try to impose this on our current infrastructure.
> >
> > One of the cgroup only container uses we see at Parallels (so no
> > separate filesystem and no net namespaces) is pure apache load balancer
> > type shared hosting.  In this scenario, base apache is effectively
> > brought up in the host environment, but then spawned instances are
> > resource limited using cgroups according to what the customer has paid.
> > Obviously all apache instances are sharing /var and /run from the host
> > (mostly for logging and pid storage and static pages).  The reason some
> > hosters do this is that it allows much higher density simple web serving
> > (either static pages from quota limited chroots or dynamic pages limited
> > by database space constraints) because each "instance" shares so much
> > from the host.  The service is obviously much more basic than giving
> > each customer a container running apache, but it's much easier for the
> > hoster to administer and it serves the customer just as well for a large
> > cross section of use cases and for those it doesn't serve, the hoster
> > usually has separate container hosting (for a higher price, of course).
> 
> The "container" we are talking about has its own init, and no, it
> cannot share /var or /run.

This is what we would call an IaaS container: bringing up init and
effectively a new OS inside a container is the closest containers come
to being like hypervisors.  It's the most common use case of Parallels
containers in the field, so I'm certainly not telling you it's a bad
idea.

> The stuff you talk about has nothing to do with that, it's not
> different from all services or a multi-instantiated service on the
> host sharing the same /run and /var.

I gave you one example: a really simplistic one.  A more sophisticated
example is a PaaS or SaaS container where you bring the OS up in the
host but spawn a particular application into its own container (this is
essentially similar to what Docker does).  Often in this case, you do
add separate mount and network namespaces to make the application
isolated and migratable with its own IP address.  The reason you share
init and most of the OS from the host is for elasticity and density,
which are fast becoming a holy grail type quest of cloud orchestration
systems: if you don't have to bring up the OS from init and you can just
start the application from a C/R image (orders of magnitude smaller than
a full system image) and slap on the necessary namespaces as you clone
it, you have something that comes online in milliseconds, which is a feat
no hypervisor-based virtualisation can match.
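
As a rough illustration of the "slap on the necessary namespaces" step
described above, here is a minimal sketch, not how Parallels or Docker
actually do it.  It assumes Linux and sufficient privilege (typically
CAP_SYS_ADMIN), and it uses fork() followed by unshare(2) instead of
passing the flags to clone(2) directly.  The names namespace_flags and
spawn_isolated are hypothetical, and restoring from a C/R image is not
shown; the child simply execs a fresh program.

```python
import ctypes
import os

# Flag values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000   # new mount namespace
CLONE_NEWNET = 0x40000000  # new network namespace


def namespace_flags(mount=True, net=True):
    """Combine the namespace flags an application container would request."""
    flags = 0
    if mount:
        flags |= CLONE_NEWNS
    if net:
        flags |= CLONE_NEWNET
    return flags


def spawn_isolated(argv, flags):
    """Fork, unshare the requested namespaces in the child, then exec argv.

    Returns the child's exit code: 126 if unshare() was refused,
    127 if exec failed, -1 if the child was killed by a signal.
    """
    libc = ctypes.CDLL(None, use_errno=True)  # symbols from the running libc
    pid = os.fork()
    if pid == 0:
        # Child: init and the rest of the OS stay shared with the host;
        # only the requested namespaces become private to this process.
        if libc.unshare(flags) != 0:
            os._exit(126)
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Run as root, something like spawn_isolated(["ip", "link"],
namespace_flags()) should show only the new namespace's own interfaces
(typically just loopback) rather than the host's.  No init is started
inside the new namespaces, which is the point of the PaaS/SaaS model.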

I'm not saying don't pursue the IaaS case, it's definitely useful ...
I'm just saying it would be a serious mistake to think that's the only
use case for containers and we certainly shouldn't adjust Linux to serve
only that use case.

James
