linux-kernel - Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart


Message-ID: <20081009124658.GE2952@elte.hu>
Date:	Thu, 9 Oct 2008 14:46:58 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Oren Laadan <orenl@...columbia.edu>
Cc:	containers@...ts.linux-foundation.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Serge Hallyn <serue@...ibm.com>,
	Dave Hansen <dave@...ux.vnet.ibm.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	MinChan Kim <minchan.kim@...il.com>, arnd@...db.de,
	jeremy@...p.org
Subject: Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart

* Oren Laadan <orenl@...columbia.edu> wrote:

> These patches implement basic checkpoint-restart [CR]. This version 
> (v6) supports basic tasks with simple private memory, and open files 
> (regular files and directories only). Changes mainly cleanups. See 
> original announcements below.

I'm wondering about the following productization aspect: it would be 
very useful to applications and users if they knew whether it is safe to 
checkpoint a given app. I.e. whether that app has any state that cannot 
be stored/restored yet.

Once we can do that, if the kernel can reliably tell whether it can 
safely checkpoint an application, we could start adding a kernel driven 
self-test of sorts: a self-propelled kernel feature that would 
transparently try to checkpoint various applications as it goes, and 
restore them immediately.

When such a test-kernel is booted then all that should be visible is an 
occasional slowdown due to the random save/restore cycles of various 
processes - but no actual application breakage should ever occur, and 
the kernel must not crash either. This would work a bit like 
CONFIG_RCUTORTURE: a constant test that should be transparent in terms 
of functionality.
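
The self-test described above can be sketched in userspace as a loop that picks a random task, saves it, restores it, and checks that nothing changed. This is only a toy model under stated assumptions: `toy_task`, `toy_checkpoint`, `toy_restore` and `cr_torture` are hypothetical names, not interfaces from the patch set, and a real implementation would live in the kernel and operate on real task state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a task; these names do not come from the patches. */
struct toy_task {
    int regs[4];          /* pretend register state */
    char mem[16];         /* pretend private memory */
    bool checkpointable;  /* false if the task uses state CR cannot save yet */
};

/* "Checkpoint": copy the task state into an image. */
static void toy_checkpoint(const struct toy_task *t, struct toy_task *image)
{
    *image = *t;
}

/* "Restore": wipe the live state, then rebuild it from the image. */
static void toy_restore(struct toy_task *t, const struct toy_task *image)
{
    memset(t, 0, sizeof(*t));
    *t = *image;
}

/* Field-by-field comparison (avoids comparing struct padding bytes). */
static bool toy_equal(const struct toy_task *a, const struct toy_task *b)
{
    return memcmp(a->regs, b->regs, sizeof(a->regs)) == 0 &&
           memcmp(a->mem, b->mem, sizeof(a->mem)) == 0 &&
           a->checkpointable == b->checkpointable;
}

/*
 * Run `rounds` random save/restore cycles over `tasks`.
 * Returns how many cycles changed a task's state; 0 means the
 * cycles were transparent, which is the property being tested.
 */
static int cr_torture(struct toy_task *tasks, size_t n, int rounds, unsigned seed)
{
    int failures = 0;

    if (n == 0)
        return 0;
    srand(seed);
    for (int i = 0; i < rounds; i++) {
        struct toy_task *t = &tasks[(size_t)rand() % n];
        struct toy_task before, image;

        if (!t->checkpointable)
            continue;  /* never touch a task that cannot be saved safely */
        before = *t;
        toy_checkpoint(t, &image);
        toy_restore(t, &image);
        if (!toy_equal(&before, t))
            failures++;
    }
    return failures;
}
```

The skip on `checkpointable` is the part that depends on the kernel being able to tell reliably which tasks are safe; without it, the random cycles would break applications instead of merely slowing them down.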

Also, the ability to tell whether a process can be safely checkpointed 
would allow apps to rely on it - they cannot accidentally use some 
kernel feature that is not saved/restored and then lose state across a 
CR cycle.

Plus, as a bonus, the inability to CR a given application would sure 
spur the development of proper checkpointing of that given kernel state. 
We could print some once-per-boot debug warning about exactly what bit 
cannot be checkpointed yet. This would create proper pressure from 
actual users of CR.
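
One way to picture the "can this task be checkpointed" flag together with the once-per-boot warning is a per-task bitmask plus a global "already warned" mask. The sketch below is a userspace illustration only: the `CR_UNSUP_*` bits, `cr_task`, `cr_mark_unsupported` and `cr_may_checkpoint` are all made-up names, and the list of unsupported state is an assumption for the example, not a statement of what the patches do or do not handle.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical kinds of state that CR cannot save yet (illustrative only). */
enum cr_unsupported {
    CR_UNSUP_SOCKET     = 1u << 0,
    CR_UNSUP_PIPE       = 1u << 1,
    CR_UNSUP_SHARED_MEM = 1u << 2,
};

/* Hypothetical per-task bookkeeping. */
struct cr_task {
    const char *comm;      /* task name, used only in the warning */
    unsigned unsupported;  /* bitmask of enum cr_unsupported */
};

/* Features already reported since "boot". */
static unsigned cr_warned_mask;

static const char *cr_unsup_name(unsigned bit)
{
    switch (bit) {
    case CR_UNSUP_SOCKET:     return "sockets";
    case CR_UNSUP_PIPE:       return "pipes";
    case CR_UNSUP_SHARED_MEM: return "shared memory";
    default:                  return "unknown state";
    }
}

/*
 * Record that task `t` started using state that cannot be checkpointed.
 * Prints a warning the first time each kind of state is seen and
 * returns true if a warning was printed.
 */
static bool cr_mark_unsupported(struct cr_task *t, unsigned bit)
{
    t->unsupported |= bit;
    if (cr_warned_mask & bit)
        return false;  /* already reported once */
    cr_warned_mask |= bit;
    fprintf(stderr, "CR: %s uses %s, which cannot be checkpointed yet\n",
            t->comm, cr_unsup_name(bit));
    return true;
}

/* A task is safe to checkpoint only if it never touched unsupported state. */
static bool cr_may_checkpoint(const struct cr_task *t)
{
    return t->unsupported == 0;
}
```

Every task that touches unsupported state is still marked, so the answer to "is it safe" stays reliable, while the log only names each missing piece once.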

	Ingo