linux-kernel - Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200704262212.27535.rjw@sisk.pl>
Date:	Thu, 26 Apr 2007 22:12:26 +0200
From:	"Rafael J. Wysocki" <rjw@...k.pl>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Pavel Machek <pavel@....cz>, Alan Cox <alan@...rguk.ukuu.org.uk>,
	Kenneth Crudup <kenny@...ix.com>,
	Nick Piggin <npiggin@...e.de>, Mike Galbraith <efault@....de>,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Con Kolivas <kernel@...ivas.org>,
	suspend2-devel@...ts.suspend2.net, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)

On Thursday, 26 April 2007 02:34, Linus Torvalds wrote:
> 
> On Wed, 25 Apr 2007, Linus Torvalds wrote:
> > 
> > The *thaw* needs to happen with devices quiescent. 
> 
> Btw, I sure as hell hope you didn't use "suspend()" for that. You're 
> (again) much better off having a totally separate function that just 
> freezes stuff.
> 
> So in the "snapshot+shutdown" path, you should have:
> 
>  - prepare_to_snapshot() - allocate memory, and possibly return errors
> 
>    We can skip this, if we just make the rule be that any devices that 
>    want to support snapshotting must always have the memory required for 
>    snapshotting pre-allocated. Most devices really do allocate memory for 
>    their state anyway, and the only real reason for the "prepare" stage 
>    here is becasue the final snapshot has to happen with interrupts off, 
>    obviously. So *if* we don't need to allocate any memory, and if we 
>    don't expect to want to accept some early error case, this is likely 
>    useless.

I think we need this.  Apparently, some device drivers need as much as 30 meg
of RAM at later stages (I don't know why and what for).

>  - snapshot() - actually save device state that is consistent with the 
>    memory image at the time. Called with interrupts off, but the device 
>    has to be usable both before and afterwards!
> 
> And I would seriously suggest that "snapshot()" be documented to not rely 
> on any DMA memory, exactly because the device has to be accessible both 
> before and after (before - because we're running and allocating memory, 
> and after - because we'll be writing thigns out). But see later:

Please note that some drivers are compiled as modules and they may deal with
uninitialized hardware (or worse, with the hardware initialized by the BIOS in
a crazy way) after restart_snapshot().  It may be better for them to actually
quiesce the devices here too to avoid problems after restart_snapshot() .

> For the "resume snapshot" path, I would suggest having 
> 
>  - freeze(): quiesce the device. This literally just does the absolute 
>    minimum to make sure that the device doesn't do anything surprising (no 
>    interrupts, no DMA, no nothing). For many devices, it's a no-op, even 
>    if they can do DMA (eg most disk controllers will do DMA, but only as 
>    an actual result of a request, and upper layers will be quiescent 
>    anyway, so they do *not* need to disable DMA)
> 
>    NOTE! The "freeze()" gets called from the *old* kernel just _before_ a
>    snapshot unpacking!!

Yes, and usually the majority of modules is not loaded at that time.

>  - restart_snapshot() - actually restart the snapshot (and usually this 
>    would involve re-setting the device, not so much trying to restore all 
>    the saved state. IOW, it's easier to just re-initialize the DMA command 
>    queues than to try to make them "atomic" in the snapshot).
> 
>    NOTE! This gets called by the *new* kernel _after_ the snapshot resume!

I think devices _should_ be resetted in restart_snapshot(), unless it's
possible to check if they have already been initialized by the "old" kernel -
but this information would have to be available from the device itself.

> And if you *want* to, I can see that you might want to actually do a 
> "unfreeze()" thing too, and make the actual shapshotting be:

What unfreeze() would be needed for in that case?

> 	/* We may not even need this.. */
> 	for_each_device() {
> 		err = prepare_to_snapshot();
> 		if (err)
> 			return err;
> 	}

We need to free as much memory as we'll need for the image creation at this
point.

> 	/* This is the real work for snapshotting */
> 	cli();
> 	for_each_device()
> 		freeze(dev);

You've added freeze() here, but it's not on your list above?

> 	for_each_device()
> 		snapshot(dev);
> 	.. snapshot current memory image ..
> 	for_each_device_depth_first()
> 		unfreeze(dev);
> 	sti();
> 
> and maybe it's worth it, but I would almost suggest that you just make the 
> rule be that any DMA etc just *has* to be re-initialized by 
> "restart_snapshot()", in which case it's not even necessary to 
> freeze/unfreeze over the device, and "snapshot()" itself only needs to 
> make sure any non-DMA data is safe.
>
> But adding the freeze/unfreeze (which is a no-op for most hardware anyway) 
> might make things easier to think about, so I would certainly not *object* 
> to it, even if I suspect it's not necessary.

I think it's not necessary.

> Anyway, the restore_snapshot() sequence should be:
> 
> 	/* Old kernel.. Normal boot, load snapshot image */
> 	cli()
> 	for_each_device()
> 		freeze(dev);
> 	restore_snapshot_image();
> 	restore_regs_and_jump_to_image();
> 	/* noreturn */
> 
> 
> 	/* New kernel, gets called at the snapshot restore address
> 	 * with interrupts off and devices frozen, and memory image
> 	 * constsntent with what it was at "snapshot()" time
> 	 */
> 	for_each_dev_depth_first()
> 		restore_snapshot(dev);
> 	/* And if you want to, just to be "symmetric"
> 
> 		for_each_dev_depth_first()
> 			unfreeze(dev)
> 
> 	   although I think you could just make "restore_snapshot()" 
> 	   implicitly unfreeze it too..

Agreed.

> 	 */
> 	sti();
> 	/* We're up */
> 
> and notice how *different* this is from what happens for s2ram. There 
> really isn't anything in common here. Exactly because s2ram simply doesn't 
> _have_ any of the issues with atomic memory images.

Agreed again.

Moreover, in the s2ram case there are no problems with device drivers compiled
as modules.

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/