Message-ID: <Pine.LNX.4.64.1011221222260.14320@takamine.ncl.cs.columbia.edu>
Date: Mon, 22 Nov 2010 12:34:54 -0500 (EST)
From: Oren Laadan <orenl@...columbia.edu>
To: Gene Cooperman <gene@....neu.edu>
cc: Tejun Heo <tj@...nel.org>, Kapil Arya <kapil@....neu.edu>,
linux-kernel@...r.kernel.org, xemul@...ru,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Linux Containers <containers@...ts.osdl.org>
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
On Sun, 21 Nov 2010, Gene Cooperman wrote:
> Below, we'll summarize the four major questions that we've understood from
> this discussion so far. But before doing so, I want to point out that a single
> process or process tree will always have many possible interactions with
> the rest of the world. Within our own group, we have an internal slogan:
> "You can't checkpoint the world."
> A virtual machine can have a relatively closed world, which makes it more
> robust, but checkpointing will always have some fragile parts.
That depends on your definition of "world". One definition
is "world := VM", as you state above. Another is "world := container"
which I stated in my post(s). You can checkpoint both.
For those cases where the "world" cannot be fully checkpointed,
I explicitly pointed out that we should focus on the core c/r
functionality, because the "glue" can be done either way.
> We give four examples below:
> a. time virtualization
IMHO, irrelevant to the current discussion. And btw, this is done in
linux-cr for live migration of tcp connections.
> b. external database
> c. NSCD daemon
This falls within the category of "glue", and is - as I try once
again to remind - entirely orthogonal to the topic of where
to do c/r.
> d. screen and other full-screen text programs
> These are not the only examples of difficult interactions with the
> rest of the world.
This actually never required a userspace "component" with Zap
or linux-cr (to the best of my knowledge).
Even if it did - the question is not how to deal with "glue"
(you demonstrated quite well how to do that with DMTCP), but
how should the basic, core c/r functionality work - which is
below, and orthogonal to the "glue".
Let us please focus on the base c/r engine functionality...
(gotta disconnect now .. more later)
Oren.