linux-kernel - Re: checkpoint/restart ABI


Message-ID: <m1abewaeu2.fsf@frodo.ebiederm.org>
Date:	Thu, 28 Aug 2008 16:40:21 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	"Serge E. Hallyn" <serue@...ibm.com>
Cc:	Peter Chubb <peterc@...ato.unsw.edu.au>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Theodore Tso <tytso@....edu>, Arnd Bergmann <arnd@...db.de>,
	containers@...ts.linux-foundation.org,
	linux-kernel@...r.kernel.org, Dave Hansen <dave@...ux.vnet.ibm.com>
Subject: Re: checkpoint/restart ABI

"Serge E. Hallyn" <serue@...ibm.com> writes:

> Quoting Peter Chubb (peterc@...ato.unsw.edu.au):

>> Beefing up ptrace or fixing /proc to be a real debugging interface
>> would be a start ... when you can get at *all* the info you need,
>
> Except we don't really want to export all the info you need for a
> complete restartable checkpoint.  And especially not make it
> generally writable.

That, and unless we get a lot of synergy with the authors of debuggers
and debugging code, it is a more general and slower interface for
no apparent gain.

> We have also started down that path using ptrace (see cryo, at
> git://git.sr71.net/~hallyn/cryodev.git).
>
> Right before the containers mini-summit, where the general agreement was
> that a complete in-kernel solution ought to be pursued, I had tried
> a restart using a binary format that read a checkpoint file and used
> cryo (userspace using ptrace) for the rest of the restart, only
> because there was no other reasonable way to set tsk->did_exec on
> restart.

Can we please describe this as the giant syscall approach, instead
of a complete in-kernel solution?  There are things like filesystems
that should be checkpointed separately, or not checkpointed at all.

However, there is a large set of processes and process state that always
goes together, and that you always want when you checkpoint a container.

So building something that is roughly equivalent to a binfmt module
but that can save and restore multiple tasks with a single operation
looks like the right granularity.

>> Jeremy> Lightweight filesystem checkpointing, such as btrfs provides,
>> Jeremy> would seem like a powerful mechanism for handling a lot of the
>> Jeremy> filesystem state problems.  It would have been useful when we
>> Jeremy> did this...
>> 
>> And how!  Saving bits of files was very time-consuming.
>
> Yes, we're looking forward to using btrfs' snapshots :)

Yep.  And in the case of migration we don't even need to snapshot
a filesystem, just mount it on the target machine.  Except for
the unlinked files challenge.
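
To make the unlinked files challenge concrete, here is a minimal sketch,
assuming a POSIX system with a writable /tmp.  The names
probe_unlinked_file and struct unlinked_result are hypothetical and not
taken from any checkpoint/restart code.  It shows a file that a task
still holds open after its last name is removed: the data stays
reachable through the descriptor, but no path leads to it, so mounting
the same filesystem on the target machine does not by itself give the
restarted task its file back.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record, for illustration only. */
struct unlinked_result {
	long nlink;        /* link count fstat() reports after unlink() */
	int path_gone;     /* 1 if stat() on the old path fails with ENOENT */
	ssize_t readable;  /* bytes still readable through the open fd */
};

/*
 * Create a temporary file, write to it, then unlink it while keeping
 * the descriptor open.  Returns 0 on success, -1 on error.
 */
static int probe_unlinked_file(struct unlinked_result *res)
{
	char path[] = "/tmp/cr-unlinked-XXXXXX";
	static const char msg[] = "open but unlinked";
	char buf[sizeof(msg)];
	struct stat st;
	int ret = -1;
	int fd;

	assert(res != NULL);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	if (write(fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg)) {
		unlink(path);
		goto out;
	}

	/* Remove the only name; the inode lives on while fd is open. */
	if (unlink(path) != 0)
		goto out;

	if (fstat(fd, &st) != 0)
		goto out;

	/* Typically 0 on local filesystems; not guaranteed everywhere. */
	res->nlink = (long)st.st_nlink;
	res->path_gone = (stat(path, &st) != 0 && errno == ENOENT);
	res->readable = pread(fd, buf, sizeof(buf), 0);
	ret = 0;
out:
	close(fd);
	return ret;
}
```

Calling probe_unlinked_file() from a small main() and printing the three
fields is enough to see the effect.  The contents of such a file exist
only through the open descriptor, so a checkpoint would have to carry
them along by some other means.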

Eric