Message-ID: <20080812144905.GA16016@us.ibm.com>
Date: Tue, 12 Aug 2008 09:49:05 -0500
From: "Serge E. Hallyn" <serue@...ibm.com>
To: Peter Chubb <peterc@...ato.unsw.edu.au>
Cc: Jeremy Fitzhardinge <jeremy@...p.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
Arnd Bergmann <arnd@...db.de>,
containers@...ts.linux-foundation.org,
Theodore Tso <tytso@....edu>, linux-kernel@...r.kernel.org
Subject: Re: checkpoint/restart ABI

Quoting Peter Chubb (peterc@...ato.unsw.edu.au):
> >>>>> "Jeremy" == Jeremy Fitzhardinge <jeremy@...p.org> writes:
>
> Jeremy> Dave Hansen wrote:
> >> Arnd, Jeremy and Oren,
> >>
>
>
> Jeremy> * multiple processes
> Jeremy> * pipes
> Jeremy> * UNIX domain sockets
> Jeremy> * INET sockets (both inter and intra machine)
> Jeremy> * unlinked open files
> Jeremy> * checkpointing file content
> Jeremy> * closed files (ie, files which aren't currently open, but
> Jeremy>   will be soon, esp tmp files)
> Jeremy> * shared memory
> Jeremy> * (Peter, what have I forgotten?)
>
> File sharing; multiple threads with weird sharing arrangements (think:
> clone with various parameters, followed by exec in some of the threads
> but not others); MERT/System V shared memory, semaphores and message
> queues; devices (audio, framebuffer, etc), HugeTLBFS, NUMA issues
> (pinning, memory layout), processes being debugged (so,
> checkpoint/restart a gdb/target pair), futexes, etc., etc. Linux
> process state keeps expanding.
>
> Jeremy> Having gone through this before, I don't think an all-kernel
> Jeremy> solution can work except for the most simple cases.
>
> I agree ... it's better to put mechanisms into the kernel that can
> then be used by a user-space programme to actually do the
> checkpointing and restarting.
>
> Beefing up ptrace or fixing /proc to be a real debugging interface
> would be a start ... when you can get at *all* the info you need,

Except we don't really want to export all the info you need for a
complete restartable checkpoint. And especially not make it
generally writable.

We have also started down that path using ptrace (see cryo, at
git://git.sr71.net/~hallyn/cryodev.git).
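
To make the /proc side of this concrete, here is a minimal sketch (Python,
and not taken from cryo) of the state a userspace tool can already read
without ptrace: memory mappings from /proc/<pid>/maps and open descriptors
from /proc/<pid>/fd. The names Mapping, parse_maps and open_fds are made up
for illustration. It only covers what /proc exports read-only, and says
nothing about kernel-internal task state or about restoring any of it.

```python
import os
from typing import Dict, List, NamedTuple


class Mapping(NamedTuple):
    """One line of /proc/<pid>/maps (hypothetical helper type)."""
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of /proc/<pid>/maps into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(start, end, fields[1], int(fields[2], 16), path))
    return mappings


def open_fds(pid: str = "self") -> Dict[int, str]:
    """Map each open fd number to the target of its /proc/<pid>/fd symlink."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
    return result


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for m in parse_maps(f.read()):
            print(f"{m.start:x}-{m.end:x} {m.perms} {m.path}")
    for fd, target in sorted(open_fds().items()):
        print(fd, target)
```

Run as a script, it dumps its own mappings and descriptors. A real
checkpointer would also need memory contents, registers, signal state and
so on, which is where ptrace comes in.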

Right before the containers mini-summit, where the general agreement was
that a complete in-kernel solution ought to be pursued, I had tried
a restart that used a binary format to read the checkpoint file and
cryo (userspace using ptrace) for the rest of the restart. The binary
format was there only because there was no other reasonable way to set
tsk->did_exec on restart.

> quickly and easily, the userspace checkpoint falls out fairly
> naturally. You still have to work out an extensible file format to
> store stuff, and how to restore all that state you've so lovingly
> collected.
>
> Jeremy> Lightweight filesystem checkpointing, such as btrfs provides,
> Jeremy> would seem like a powerful mechanism for handling a lot of the
> Jeremy> filesystem state problems. It would have been useful when we
> Jeremy> did this...
>
> And how! Saving bits of files was very time-consuming.

Yes, we're looking forward to using btrfs' snapshots :)

-serge