Message-Id: <1236981097.30142.251.camel@nimitz>
Date:	Fri, 13 Mar 2009 14:51:37 -0700
From:	Dave Hansen <dave@...ux.vnet.ibm.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Alexey Dobriyan <adobriyan@...il.com>,
	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
	Ying Han <yinghan@...gle.com>,
	"Serge E. Hallyn" <serue@...ibm.com>, linux-api@...r.kernel.org,
	containers@...ts.linux-foundation.org, hpa@...or.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	viro@...iv.linux.org.uk, mingo@...e.hu, mpm@...enic.com,
	Andrew Morton <akpm@...ux-foundation.org>, xemul@...nvz.org,
	tglx@...utronix.de
Subject: Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ
	do?

On Fri, 2009-03-13 at 14:01 -0700, Linus Torvalds wrote:
> On Fri, 13 Mar 2009, Alexey Dobriyan wrote:
> > > Let's face it, we're not going to _ever_ checkpoint any kind of general 
> > > case process. Just TCP makes that fundamentally impossible in the general 
> > > case, and there are lots and lots of other cases too (just something as 
> > > totally _trivial_ as all the files in the filesystem that don't get rolled 
> > > back).
> > 
> > What do you mean here? Unlinked files?
> 
> Or modified files, or anything else. "External state" is a pretty damn 
> wide net. It's not just TCP sequence numbers and another machine.

This is precisely the reason that we've focused so hard on containers,
and *didn't* just jump right into checkpoint/restart; we're trying
really hard to constrain the _truly_ external things that a process can
interact with.  

The approach so far has largely been to make things that are external to a
process at least *internal* to a container.  Network, pid, ipc, and uts
namespaces, for example.  An ipc/sem.c semaphore may be external to a
process, so we'll just pick the whole namespace up and checkpoint it
along with the process.
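
As a rough illustration of "picking the whole namespace up", the sketch
below builds the clone-flag mask for a set of namespaces and hands it to
unshare(2).  The flag values are the ones in <linux/sched.h>; the helper
names (container_flags, enter_new_namespaces) are made up for this
example, and the unshare call needs CAP_SYS_ADMIN.  Whether unshare(2)
accepts a given flag depends on the kernel version, so treat this as a
sketch and not as what any checkpoint/restart patch set actually does.

```python
import ctypes
import os

# Clone flag values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000
CLONE_NEWUTS = 0x04000000
CLONE_NEWIPC = 0x08000000
CLONE_NEWPID = 0x20000000
CLONE_NEWNET = 0x40000000

# Short names are an assumption of this sketch, not a kernel interface.
NAMESPACE_FLAGS = {
    "mnt": CLONE_NEWNS,
    "uts": CLONE_NEWUTS,
    "ipc": CLONE_NEWIPC,
    "pid": CLONE_NEWPID,
    "net": CLONE_NEWNET,
}


def container_flags(names):
    """OR together the clone flags for the requested namespaces."""
    flags = 0
    for name in names:
        flags |= NAMESPACE_FLAGS[name]
    return flags


def enter_new_namespaces(names):
    """Hypothetical helper: move the caller into fresh namespaces.

    Calls unshare(2) through libc and raises OSError on failure
    (typically EPERM without CAP_SYS_ADMIN, or EINVAL if the running
    kernel does not accept one of the flags).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(container_flags(names)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Something like enter_new_namespaces(["ipc", "uts", "net"]) would then give
the caller private SysV IPC objects, hostname, and network stack, so a
semaphore created afterwards lives inside the container's ipc namespace
instead of being shared with the rest of the machine.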

In the OpenVZ case, they've at least demonstrated that the filesystem
can be moved largely with rsync.  Unlinked files need some in-kernel TLC
(or /proc mangling) but it isn't *that* bad.

We can also make the fs problem much easier by using things like dm or
btrfs snapshotting of the block device, or by restricting where on a fs
a container is allowed to write with stuff like r/o bind mounts.
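
For the r/o bind mount case, a minimal sketch of the usual two-step
sequence follows: bind the directory first, then remount that bind
read-only.  The MS_* values are the ones in <sys/mount.h>; the helper
names and any paths passed in are assumptions for illustration, and the
mount(2) calls need CAP_SYS_ADMIN.

```python
import ctypes
import os

# Mount flag values from <sys/mount.h>.
MS_RDONLY = 0x1
MS_REMOUNT = 0x20
MS_BIND = 0x1000


def readonly_bind_steps(source, target):
    """Return the (source, target, flags) triples for a r/o bind mount.

    Step 1 creates the bind mount; step 2 remounts it read-only.
    """
    return [
        (source, target, MS_BIND),
        (None, target, MS_BIND | MS_REMOUNT | MS_RDONLY),
    ]


def apply_readonly_bind(source, target):
    """Hypothetical helper: run the two mount(2) calls through libc."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [
        ctypes.c_char_p,  # source
        ctypes.c_char_p,  # target
        ctypes.c_char_p,  # filesystem type (unused for bind mounts)
        ctypes.c_ulong,   # mount flags
        ctypes.c_void_p,  # data
    ]
    for src, dst, flags in readonly_bind_steps(source, target):
        src_bytes = src.encode() if src is not None else None
        if libc.mount(src_bytes, dst.encode(), None, flags, None) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
```

Everything the container sees under the target is then read-only, so
only the remaining writable part of the tree has to be snapshotted or
copied at checkpoint time.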

-- Dave

