Message-ID: <9b2b86521002240823t126d5ad8nbd292da0f4090e6c@mail.gmail.com>
Date:	Wed, 24 Feb 2010 16:23:46 +0000
From:	Alan Jenkins <sourcejedi.lkml@...glemail.com>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
Cc:	Mel Gorman <mel@....ul.ie>, hugh.dickins@...cali.co.uk,
	Pavel Machek <pavel@....cz>,
	pm list <linux-pm@...ts.linux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Kernel Testers List <kernel-testers@...r.kernel.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Linux MM <linux-mm@...ck.org>
Subject: Re: s2disk hang update

On 2/23/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
> On Tuesday 23 February 2010, Alan Jenkins wrote:
>> On 2/22/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
>> > On Monday 22 February 2010, Alan Jenkins wrote:
>> >> Rafael J. Wysocki wrote:
>> >> > On Friday 19 February 2010, Alan Jenkins wrote:
>> >> >
>> >> >> On 2/18/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
>> >> >>
>> >> >>> On Thursday 18 February 2010, Alan Jenkins wrote:
>> >> >>>
>> >> >>>> On 2/17/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
>> >> >>>>
>> >> >>>>> On Wednesday 17 February 2010, Alan Jenkins wrote:
>> >> >>>>>
>> >> >>>>>> On 2/16/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
>> >> >>>>>>
>> >> >>>>>>> On Tuesday 16 February 2010, Alan Jenkins wrote:
>> >> >>>>>>>
>> >> >>>>>>>> On 2/16/10, Alan Jenkins <sourcejedi.lkml@...glemail.com>
>> >> >>>>>>>> wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>> On 2/15/10, Rafael J. Wysocki <rjw@...k.pl> wrote:
>> >> >>>>>>>>>
>> >> >>>>>>>>>> On Tuesday 09 February 2010, Alan Jenkins wrote:
>> >> >>>>>>>>>>
>> >> >>>>>>>>>>> Perhaps I spoke too soon.  I see the same hang if I run
>> >> >>>>>>>>>>> too many applications.  The first hibernation fails with
>> >> >>>>>>>>>>> "not enough swap" as expected, but the second or third
>> >> >>>>>>>>>>> attempt hangs (with the same backtrace as before).
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>>> The patch definitely helps though.  Without the patch, I
>> >> >>>>>>>>>>> see a hang the first time I try to hibernate with too many
>> >> >>>>>>>>>>> applications running.
>> >> >>>>>>>>>>>
>> >> >>>>>>>>>> Well, I have an idea.
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Can you try to apply the appended patch in addition and
>> >> >>>>>>>>>> see if that helps?
>> >> >>>>>>>>>>
>> >> >>>>>>>>>> Rafael
>> >> >>>>>>>>>>
>> >> >>>>>>>>> It doesn't seem to help.
>> >> >>>>>>>>>
>> >> >>>>>>>> To be clear: It doesn't stop the hang when I hibernate with
>> >> >>>>>>>> too many applications.
>> >> >>>>>>>>
>> >> >>>>>>>> It does stop the same hang in a different case though.
>> >> >>>>>>>>
>> >> >>>>>>>> 1. boot with init=/bin/bash
>> >> >>>>>>>> 2. run s2disk
>> >> >>>>>>>> 3. cancel the s2disk
>> >> >>>>>>>> 4. repeat steps 2&3
>> >> >>>>>>>>
>> >> >>>>>>>> With the patch, I can run 10s of iterations, with no hang.
>> >> >>>>>>>> Without the patch, it soon hangs (in disable_nonboot_cpus(),
>> >> >>>>>>>> as always).
>> >> >>>>>>>>
>> >> >>>>>>>> That's what happens on 2.6.33-rc7.  On 2.6.30, there is no
>> >> >>>>>>>> problem.  On 2.6.31 and 2.6.32 I don't get a hang, but dmesg
>> >> >>>>>>>> shows an allocation failure after a couple of iterations
>> >> >>>>>>>> ("kthreadd: page allocation failure. order:1, mode:0xd0").
>> >> >>>>>>>> It looks like it might be the same stop_machine thread
>> >> >>>>>>>> allocation failure that causes the hang.
>> >> >>>>>>>>
>> >> >>>>>>> Have you tested it alone or on top of the previous one?  If
>> >> >>>>>>> you've tested it alone, please apply the appended one in
>> >> >>>>>>> addition to it and retest.
>> >> >>>>>>>
>> >> >>>>>>> Rafael
>> >> >>>>>>>
>> >> >>>>>> I did test with both patches applied together -
>> >> >>>>>>
>> >> >>>>>> 1. [Update] MM / PM: Force GFP_NOIO during suspend/hibernation
>> >> >>>>>>    and resume
>> >> >>>>>> 2. "reducing the number of pages that we're going to keep
>> >> >>>>>>    preallocated by 20%"
>> >> >>>>>>
>> >> >>>>> In that case you can try to reduce the number of preallocated
>> >> >>>>> pages even more, ie. change "/ 5" to "/ 2" (for example) in the
>> >> >>>>> second patch.
>> >> >>>>>
>> >> >>>> It still hangs if I try to hibernate a couple of times with too
>> >> >>>> many applications.
>> >> >>>>
>> >> >>> Hmm.  I guess I asked that before, but is this a 32-bit or 64-bit
>> >> >>> system and how much RAM is there in the box?
>> >> >>>
>> >> >>> Rafael
>> >> >>>
>> >> >> EeePC 701.  32 bit.  512Mb RAM.  350Mb swap file, on a
>> >> >> "first-gen" SSD.
>> >> >>
>> >> >
>> >> > Hmm.  I'd try to make free_unnecessary_pages() free all of the
>> >> > preallocated pages and see what happens.
>> >> >
>> >>
>> >> It still hangs in hibernation_snapshot() / disable_nonboot_cpus().
>> >> After apparently freeing over 400Mb / 100,000 pages of preallocated
>> >> ram.
>> >>
>> >>
>> >>
>> >> There is a change which I missed before.  When I applied your first
>> >> patch ("Force GFP_NOIO during suspend" etc.), it did change the hung
>> >> task backtraces a bit.  I don't know if it tells us anything.
>> >>
>> >> Without the patch, there were two backtraces.  The first backtrace
>> >> suggested a problem allocating pages for a kernel thread (at
>> >> copy_process() / try_to_free_pages()).  The second showed that this
>> >> problem was blocking s2disk (at hibernation_snapshot() /
>> >> disable_nonboot_cpus() / stop_machine_create()).
>> >>
>> >> With the GFP_NOIO patch, I see only the s2disk backtrace.
>> >
>> > Can you please post this backtrace?
>>
>> Sure.  It's rather like the one I posted before, except
>>
>> a) it only shows the one hung task (s2disk)
>> b) this time I had lockdep enabled
>> c) this time most of the lines don't have question marks.
>
> Well, it still looks like we're waiting for create_workqueue_thread() to
> return, which probably is trying to allocate memory for the thread
> structure.
>
> My guess is that the preallocated memory pages freed by
> free_unnecessary_pages() go into a place from where they cannot be taken for
> subsequent NOIO allocations.  I have no idea why that happens though.
>
> To test that theory you can try to change GFP_IOFS to GFP_KERNEL in the
> calls to clear_gfp_allowed_mask() in kernel/power/hibernate.c (and in
> kernel/power/suspend.c for completeness).

Effectively forcing GFP_NOWAIT, so the allocation should fail instead
of hanging?

It seems to stop the hang, but I don't see any other difference - the
hibernation process isn't stopped earlier, and I don't get any new
kernel messages about allocation failures.  I wonder if it's because
GFP_NOWAIT triggers ALLOC_HARDER.
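
As a sketch of the flag arithmetic involved (the bit values and mask
definitions below are just my reading of the 2.6.33-era
include/linux/gfp.h, so treat them as assumptions; this is a userspace
toy for illustration, not the kernel code itself):

/* Illustration only -- not kernel code.  Bit values assumed from the
 * 2.6.33-era include/linux/gfp.h. */
#include <stdio.h>

#define __GFP_WAIT  0x10u                 /* may sleep and reclaim      */
#define __GFP_IO    0x40u                 /* may start low-level I/O    */
#define __GFP_FS    0x80u                 /* may call into the FS       */

#define GFP_NOWAIT  0u                    /* GFP_ATOMIC & ~__GFP_HIGH   */
#define GFP_NOIO    (__GFP_WAIT)
#define GFP_KERNEL  (__GFP_WAIT | __GFP_IO | __GFP_FS)
#define GFP_IOFS    (__GFP_IO | __GFP_FS)

/* The suspend code clears bits from gfp_allowed_mask; every allocation
 * made afterwards is effectively ANDed with whatever remains. */
static unsigned int effective(unsigned int requested, unsigned int cleared)
{
	return requested & (GFP_KERNEL & ~cleared);
}

int main(void)
{
	/* Current patch: clearing GFP_IOFS degrades GFP_KERNEL to GFP_NOIO,
	 * i.e. the allocation may still sleep and reclaim, just without
	 * starting I/O -- so it can still block in reclaim. */
	printf("clear GFP_IOFS:   0x%02x (GFP_NOIO is 0x%02x)\n",
	       effective(GFP_KERNEL, GFP_IOFS), GFP_NOIO);

	/* Suggested test: clearing GFP_KERNEL strips __GFP_WAIT as well, so
	 * the allocation behaves like GFP_NOWAIT and fails fast instead of
	 * blocking. */
	printf("clear GFP_KERNEL: 0x%02x (GFP_NOWAIT is 0x%02x)\n",
	       effective(GFP_KERNEL, GFP_KERNEL), GFP_NOWAIT);
	return 0;
}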

I have other evidence which argues for your theory:

[ successful s2disk, with forced NOIO (but not NOWAIT), and test code
as attached ]

 Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
 1280 GFP_NOWAIT allocations of order 0 are possible
 640 GFP_NOWAIT allocations of order 1 are possible
 320 GFP_NOWAIT allocations of order 2 are possible

[ note - 1280 pages is the maximum test allocation used here.  The
test code is only accurate when talking about smaller numbers of free
pages ]

 1280 GFP_KERNEL allocations of order 0 are possible
 640 GFP_KERNEL allocations of order 1 are possible
 320 GFP_KERNEL allocations of order 2 are possible

 PM: Preallocating image memory...
 212 GFP_NOWAIT allocations of order 0 are possible
 102 GFP_NOWAIT allocations of order 1 are possible
 50 GFP_NOWAIT allocations of order 2 are possible

 Freeing all 90083 preallocated pages
 (and 0 highmem pages, out of 0)
 190 GFP_NOWAIT allocations of order 0 are possible
 102 GFP_NOWAIT allocations of order 1 are possible
 50 GFP_NOWAIT allocations of order 2 are possible
 1280 GFP_KERNEL allocations of order 0 are possible
 640 GFP_KERNEL allocations of order 1 are possible
 320 GFP_KERNEL allocations of order 2 are possible
 done (allocated 90083 pages)

It looks like you're right and the freed pages are not accessible with
GFP_NOWAIT for some reason.
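
The idea behind the test code is roughly the following.  This is only a
sketch with made-up names (check_free(), cf_pages, CHECK_FREE_PAGE_CAP),
not the actual attachment: allocate pages of a given order with a given
mask until either the allocation fails or a 1280-page cap is hit, print
the count, then free everything again.

/* Sketch only, not the real check-free.patch: probe how many allocations
 * of a given order succeed with a given gfp mask, capped at 1280 pages
 * total, then hand everything back.  The cap is why the "1280 ... are
 * possible" lines above are only a lower bound. */
#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/mm.h>

#define CHECK_FREE_PAGE_CAP 1280

/* Static so the pointer array stays off the small kernel stack; this is
 * single-threaded test code, so no locking. */
static struct page *cf_pages[CHECK_FREE_PAGE_CAP];

static void check_free(gfp_t gfp, unsigned int order, const char *name)
{
	unsigned int i, n = 0, max = CHECK_FREE_PAGE_CAP >> order;

	while (n < max) {
		cf_pages[n] = alloc_pages(gfp, order);
		if (!cf_pages[n])
			break;
		n++;
	}
	for (i = 0; i < n; i++)
		__free_pages(cf_pages[i], order);

	printk(KERN_INFO "%u %s allocations of order %u are possible\n",
	       n, name, order);
}

A probe like this would be called for each mask/order pair at the points
where the output above appears, with everything it grabs handed back
immediately.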

I also tried a number of test runs with too many applications, and saw this:

Freeing all 104006 preallocated pages ...
65 GFP_NOWAIT allocations of order 0 ...
18 GFP_NOWAIT allocations of order 1 ...
9 GFP_NOWAIT allocations of order 2 ...
0 GFP_KERNEL allocations of order 0 are possible
...
Disabling nonboot cpus ...
...
PM: Hibernation image created
Force enabled HPET at resume
PM: early thaw of devices complete after ... msecs

<hang, no backtrace visible even after 120 seconds>

I'm not bothered by the new hang; the test code will inevitably have
some side effects.  I'm not sure why GFP_KERNEL allocations would fail
in this scenario though...  perhaps the difference is that we've
swapped out the entire userspace so GFP_IO doesn't help.

Regards
Alan

View attachment "check-free.patch" of type "text/x-patch" (2446 bytes)
