linux-kernel - Re: [Bug #13058] First hibernation attempt fails


Message-ID: <alpine.LFD.2.00.0904170813550.4042@localhost.localdomain>
Date:	Fri, 17 Apr 2009 08:55:17 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jens Axboe <jens.axboe@...cle.com>
cc:	Alan Jenkins <alan-jenkins@...fmail.co.uk>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>
Subject: Re: [Bug #13058] First hibernation attempt fails

On Fri, 17 Apr 2009, Jens Axboe wrote:
> 
> Given the somewhat odd nature of the bug and the requirements to trigger
> it, how confident are you in the bisection results?

I suspect it's timing-dependent. 

The failure case is an ENOMEM returned from the "echo disk > /sys/power/state", 
and sadly there are a _lot_ of potential sources of ENOMEMs in the path. 
And a number of them come from GFP_ATOMIC allocations etc.

Now, that explains why it only happens while in X (more memory being 
used), and also why it succeeds the second time (the first try will have 
triggered VM activity and then free'd the pages it allocated up to that 
point).

IOW, I bet it would work on the first try if you were to just run 
something like

	ptr = malloc(BIGNUM);
	memset(ptr, 0, BIGNUM);
	exit(0);

first - just to make room for stuff.
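
Spelled out as a compilable C helper, the snippet above might look like the sketch below. The name touch_memory() and the way the size is passed in are made up for illustration; the only idea taken from the mail is "allocate a big chunk, write to it so the pages really get instantiated, then give it all back". It writes one byte per page through a volatile pointer instead of calling memset(), on the assumption that an optimizing compiler might otherwise drop writes to memory that is freed right afterwards.

```c
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of memory, write to every page of it so the kernel
 * has to back it with real pages (pushing other things out), then
 * free it again.
 *
 * Returns 0 on success, -1 if the allocation itself failed.
 * Hypothetical helper; not part of any kernel or libc API.
 */
int touch_memory(size_t bytes)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile char *ptr;
	size_t i;

	if (bytes == 0)
		return 0;

	/* Assumption: fall back to 4 KiB if the page size is unknown. */
	if (page_size <= 0)
		page_size = 4096;

	ptr = malloc(bytes);
	if (!ptr)
		return -1;

	/* malloc() alone mostly reserves address space; the stores
	 * below are what actually fault the pages in. */
	for (i = 0; i < bytes; i += (size_t)page_size)
		ptr[i] = 0;

	free((void *)ptr);
	return 0;
}
```

Called once from a small main() before the first "echo disk > /sys/power/state", with BIGNUM being whatever is large enough to force reclaim on the machine in question.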

And the thing is, swsusp_save() really does do odd things. For example, to 
get rid of unnecessary memory, it does "drain_local_pages()", where the 
"local" is "local cpu". Why does it do that? Likely nobody knows.

Now, that won't matter in Alan's case (he is UP), but the point is, the 
swsuspend code does these random things to try to free up memory, and I 
suspect it's mostly been a trial-and-error thing. And then subtle changes 
in memory usage when allocating or writing things out will change things.

For example, there is a magic "PAGES_FOR_IO" #define, which is somewhat 
arbitrarily set to 4MB worth of pages. Where did that number come from? 
Who knows? But that's the number the code uses for the _initial_ check of 
"do we have enough memory" (the one that must have passed, since it 
actually started doing things and didn't print out a warning message).
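
To put a number on "4MB worth of pages": assuming 4 KiB pages (a page shift of 12, as on x86), that comes to 1024 pages. A sketch of the arithmetic follows; the helper names are invented for illustration and this is not the kernel's own definition of the macro.

```c
/*
 * Hypothetical helper: how many whole pages fit in `bytes`, for a
 * page size of (1 << page_shift) bytes.  Rounds down.
 */
unsigned long bytes_to_pages(unsigned long bytes, unsigned int page_shift)
{
	return bytes >> page_shift;
}

/* 4MB expressed in pages, for a given page shift. */
unsigned long pages_for_io(unsigned int page_shift)
{
	return bytes_to_pages(4096UL * 1024UL, page_shift);
}
```

With a page shift of 12 this gives 1024 pages; with 64 KiB pages (shift 16) it would be only 64.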

Anyway, from the dmesg, we can see:

	[   41.873619] PM: Shrinking memory...  Restarting tasks ... done.

and this is a clear indication that it's "swsusp_shrink_memory()" that 
failed. If it had succeeded, you'd have seen

	PM: Shrinking memory... done (xyz pages freed)

but it returned an error case, and then the suspend fails and starts 
restarting tasks.
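
The same check can be done mechanically on a saved dmesg. The sketch below relies only on the two message shapes quoted above (a "done (" page count after the shrink message on success, anything else on failure); the function name is made up, and it assumes the message arrives as one plain line of text.

```c
#include <string.h>

/*
 * Returns  1 if the line has "PM: Shrinking memory..." followed by a
 *            "done (" page count (the success case),
 *          0 if the shrink message is there without it (the failure
 *            case seen in the dmesg above),
 *         -1 if the line has no shrink message at all.
 * Hypothetical helper for illustration only.
 */
int shrink_memory_succeeded(const char *line)
{
	static const char marker[] = "PM: Shrinking memory...";
	const char *p = strstr(line, marker);

	if (!p)
		return -1;

	/* Only look at what was printed after the marker. */
	p += sizeof(marker) - 1;
	return strstr(p, "done (") != NULL;
}
```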

And the thing is, that "swsusp_shrink_memory()" is just full of 
heuristics. There's no hard numbers there. It doesn't seem to wait for 
writeout, it just does the equivalent of "shrink_list()" and 
"shrink_slab()", but it seems to have been basically cribbed half-way 
from the regular "try to free memory", without really doing it all.

Just as an example: it does that "zone_is_all_unreclaimable()" logic that 
expects kswapd to mark things reclaimable again, but it doesn't seem to 
actually ever wait for kswapd or pdflush. It also seems to set 
"swappiness" to zero etc. Maybe it's all intentional, but it does mean 
that it uses some shared heuristics with the "real" VM, but uses them 
differently.

		Linus