Message-ID: <4CE69B93.3050904@cs.columbia.edu>
Date: Sat, 20 Nov 2010 13:08:13 -0500 (EST)
From: Oren Laadan <orenl@...columbia.edu>
To: Tejun Heo <tj@...nel.org>
cc: Serge Hallyn <serge.hallyn@...onical.com>,
Kapil Arya <kapil@....neu.edu>,
Gene Cooperman <gene@....neu.edu>,
linux-kernel@...r.kernel.org, xemul@...ru,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Linux Containers <containers@...ts.osdl.org>
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
Hi,
[continuation of posting regarding kernel vs userspace approach]
part I: perspective on the types and scopes of c/r under discussion
part II: linux-cr design and objectives
part III: comparison kernel/userspace approaches
PART II: ==PHILOSOPHY==
Linux-cr is a _generic_ c/r-engine with multiple capabilities. It can
checkpoint a full container, a process hierarchy, or a single process.
For containers, it provides guarantees like restart-ability; for the
others, it provides the flexibility so that c/r-aware applications,
libraries, helpers, and wrappers can glue what they wish to glue.
1) Transparent - completely transparent for container-c/r, and largely
so for standalone-c/r ("largely" - as in except for the glue which is
needed due to loss of eco-system, not due to restarting).
2) Reliable - if checkpoint succeeds, then restart is guaranteed
to succeed too (for container-c/r).
3) Preemptive - works without requiring that checkpointed processes
be scheduled to run (and thus "collaborate")
4) Complete - covers all visible and hidden state in the kernel
about processes (even if not directly visible to userspace)
5) Efficient - can be optimized along multiple axes: _zero_ impact on
runtime, low downtime during checkpoint, partial and incremental
checkpoint, live-migration, etc.
6) Flexible - can integrate nicely with different userspace "glueing"
methods.
7) Maintainable - a small part of the code refactors kernel code
so that it can be reused in restart; the rest is new code that in
our experience rarely changes. The same holds for the image format.
What linux-cr _does not_ do in the kernel, nor plans to support, is:
1) Hardware devices: their state is per-device/vendor. Instead one
should use virtual devices (VNC for display, pulseaudio for sound,
screen for ttys), or have a userspace glue to restore the state of
the device. That said, in the future vendors may opt to provide
logic for c/r in drivers, e.g. ->checkpoint, ->restart methods.
2) Userspace glue: (as defined for standalone-c/r above) the kernel
knows about processes and their state, not about their intentions.
We leave that for userspace.
3) External dependencies: (outside of the local host) the kernel does
not control what's outside the host. That is the responsibility of
userspace. (Even with live-migration, linux-cr only restores
the local state of the TCP connections).
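To make the ->checkpoint / ->restart idea from item 1 above a bit more
concrete, here is a rough userspace sketch in C of what such per-device
hooks might look like. Everything in it is hypothetical and only for
illustration: struct cr_dev_ops, struct toy_audio and the toy_* helpers
are made-up names, not an existing kernel interface and not the linux-cr
code. The toy device's entire state is a single volume setting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device c/r hooks; not an existing kernel interface. */
struct cr_dev_ops {
	/* Serialize device state into buf; return bytes written, or -1. */
	long (*checkpoint)(const void *dev, void *buf, size_t len);
	/* Rebuild device state from buf; return 0 on success, or -1. */
	int (*restart)(void *dev, const void *buf, size_t len);
};

/* Toy "device" whose whole state is one volume setting (assumption). */
struct toy_audio {
	int volume;
};

static long toy_checkpoint(const void *dev, void *buf, size_t len)
{
	const struct toy_audio *a = dev;

	if (len < sizeof(a->volume))
		return -1;
	memcpy(buf, &a->volume, sizeof(a->volume));
	return (long)sizeof(a->volume);
}

static int toy_restart(void *dev, const void *buf, size_t len)
{
	struct toy_audio *a = dev;

	if (len < sizeof(a->volume))
		return -1;
	memcpy(&a->volume, buf, sizeof(a->volume));
	return 0;
}

static const struct cr_dev_ops toy_ops = {
	.checkpoint = toy_checkpoint,
	.restart = toy_restart,
};

/* Checkpoint one device, restart into a fresh one, return its volume. */
static int toy_roundtrip(int volume)
{
	struct toy_audio before = { .volume = volume };
	struct toy_audio after = { .volume = 0 };
	unsigned char image[16];
	long n;

	n = toy_ops.checkpoint(&before, image, sizeof(image));
	if (n < 0)
		return -1;
	assert((size_t)n <= sizeof(image));
	if (toy_ops.restart(&after, image, (size_t)n) < 0)
		return -1;
	return after.volume;
}
```

The point of the sketch is only the division of labor: the generic c/r
engine would call the hooks, while the knowledge of what the device
state is and how to rebuild it stays with the driver (or vendor).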
Oren.