linux-kernel - Re: A kexec approach to hibernation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070604220910.GB2515@andrew.cmu.edu>
Date:	Mon, 4 Jun 2007 18:09:10 -0400
From:	Jeremy Maitin-Shepard <jmaitins@...rew.cmu.edu>
To:	Pavel Machek <pavel@....cz>
Cc:	Jeremy Maitin-Shepard <jbms@....edu>,
	"Rafael J. Wysocki" <rjw@...k.pl>, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...l.org>,
	Nigel Cunningham <nigel@...el.suspend2.net>
Subject: Re: A kexec approach to hibernation

On Mon, Jun 04, 2007 at 12:46:21PM +0200, Pavel Machek wrote:

> [snip]

> > You might claim then that the solution is to simply keep the network
> > driver quiesced or stopped.  But then it is impossible to write the
> > image over the network.  The way to get around this problem is to write
> > the image over the network using a fresh network stack.
> 
> The "fresh network stack" will RST any connections that were going,
> which is ugly, too.

It will only do this if you bring up the network device with the same IP 
address in the new kernel (which you would have no reason to do if you don't 
need to write the image over the network.)  Maybe the ideal behavior would be 
to tell the network stack to just ignore unexpected TCP packets, rather than 
send RST, while saving or reading the image, but that is probably not necessary 
for most uses and would be a hack.

I also think that sending RST is far better than sending ACK and then silently 
tossing out the data, which is what is currently done.  (Since I 
believe currently the network devices are brought back up along with 
all other devices after the atomic copy is made.)  Silently losing data is 
something that should only occur on a crash.  This is likely to actually be a 
somewhat serious problem for servers on which hibernate is used to move the 
server between rooms without losing connections.

Of course, you can get around this by adding a hack to not bring up network 
devices based on some option or other, but that just solves one specific 
case with an ugly solution.  In contrast, using the kexec approach, the network 
device or any other device would quite naturally not be brought back up unless 
it was needed for hibernate, and even if it is brought back up, no data are 
silently lost.

> [snip]

> > To me, it seems a lot easier to get right than the current approaches.
> 
> Well, you are certainly welcome to create the patch. "suspend3" name
> is still free, AFAICT.

I could be sneaky and call it "hibernate".  Probably nicer though to use the 
name "kexec hibernate" to be later simplified to just "hibernate".

I was hoping that everyone would like the idea so much that they would rush to 
implement it, so that I wouldn't have to try.  (I haven't written much kernel 
code before, and I have a number of other time-requiring projects to work on.) 
It looks like that is not too likely to happen though ;).

Maybe I'll try implementing it though, and find that it isn't very much work.

It would be very convenient if the current work being done to improve the 
driver interfaces for hibernate also results in the proper interfaces needed 
for this approach.  It looks like the resume path should be exactly the same 
with this approach as with the existing approaches, but the hibernate path is 
not exactly the same.  In particular, it seems that all devices should be shut 
down to a greater extent that merely the quiescing neccessary for the current 
approaches while making an atomic copy, but also they should not be completely 
shut down to the extent that they cannot be restored to the desired state when 
resuming or aborting.

> 
> If _I_ were willing to add some runtime overhead to make hibernation
> simpler, I'd just use some virtualization to do that... with added
> advantage of "hibernate here, resume on different hw".

I don't believe there is going to be any runtime overhead.

To some extent, (see some of the explanations I gave in the other e-mail I
sent a few minutes ago in reply to Nigel) I think the kexec appraoch can be
viewed as a cleaner variant of userspace hibernate.

-- 
Jeremy Maitin-Shepard
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/