Message-ID: <552FFD83.5030901@hp.com>
Date: Thu, 16 Apr 2015 14:20:51 -0400
From: Waiman Long <waiman.long@...com>
To: Mel Gorman <mgorman@...e.de>
CC: Linux-MM <linux-mm@...ck.org>, Nathan Zimmer <nzimmer@....com>,
Daniel Rahn <drahn@...e.com>,
Davidlohr Bueso <dbueso@...e.com>,
Dave Hansen <dave.hansen@...el.com>,
Tom Vaden <tom.vaden@...com>,
Scott Norton <scott.norton@...com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 0/14] Parallel memory initialisation

On 04/15/2015 09:38 AM, Mel Gorman wrote:
>> However, there were 2 bootup problems in the dmesg log that needed
>> to be addressed.
>> 1. There were 2 vmalloc allocation failures:
>> [ 2.284686] vmalloc: allocation failure, allocated 16578404352 of
>> 17179873280 bytes
>> [ 10.399938] vmalloc: allocation failure, allocated 7970922496 of
>> 8589938688 bytes
>>
>> 2. There were 2 soft lockup warnings:
>> [ 57.319453] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s!
>> [swapper/0:1]
>> [ 85.409263] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
>> [swapper/0:1]
>>
>> Once those problems are fixed, the patch should be in a pretty good
>> shape. I have attached the dmesg log for your reference.
>>
> The obvious conclusion is that initialising 1G per node is not enough for
> really large machines. Can you try this on top? It's untested but should
> work. The low value was chosen because it happened to work and I wanted
> to get test coverage on common hardware but broke is broke.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f2c96d02662f..6b3bec304e35 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -276,9 +276,9 @@ static inline bool update_defer_init(pg_data_t *pgdat,
> if (pgdat->first_deferred_pfn != ULONG_MAX)
> return false;
>
> - /* Initialise at least 1G per zone */
> + /* Initialise at least 32G per node */
> (*nr_initialised)++;
> - if (*nr_initialised > (1UL << (30 - PAGE_SHIFT)) &&
> + if (*nr_initialised > (32UL << (30 - PAGE_SHIFT)) &&
> (pfn & (PAGES_PER_SECTION - 1)) == 0) {
> pgdat->first_deferred_pfn = pfn;
> return false;
>
>
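
For reference, a minimal sketch of the arithmetic behind the threshold in the patch above. It assumes 4 KiB pages (PAGE_SHIFT = 12) and 128 MiB sections (32768 pages per section), as is typical on x86-64; neither value is stated in this thread. The helper names gib_to_pages() and should_start_deferring() are hypothetical and are not kernel functions; they only mirror the condition shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Number of pages in `gib` GiB for a given page shift.
 * Mirrors the expression (NUL << (30 - PAGE_SHIFT)) from the patch.
 * page_shift is a parameter here only for illustration.
 */
static unsigned long gib_to_pages(unsigned long gib, unsigned int page_shift)
{
    assert(page_shift <= 30);
    return gib << (30 - page_shift);
}

/*
 * Hypothetical stand-in for the condition in the quoted hunk:
 * deferral starts once more than `threshold_pages` pages have been
 * initialised and `pfn` sits on a section boundary.
 * pages_per_section must be a power of two for the mask to work.
 */
static bool should_start_deferring(unsigned long nr_initialised,
                                   unsigned long pfn,
                                   unsigned long threshold_pages,
                                   unsigned long pages_per_section)
{
    return nr_initialised > threshold_pages &&
           (pfn & (pages_per_section - 1)) == 0;
}
```

Under these assumptions, gib_to_pages(1, 12) is 262144 pages and gib_to_pages(32, 12) is 8388608 pages, so the patch raises the number of pages initialised before deferral starts by a factor of 32.
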
I applied the patch and the boot time was 299s instead of 298s, so
practically the same. Both of the issues that I described previously
are gone. Attached is the new dmesg log for your reference.

Cheers,
Longman
View attachment "dmesg-4.0-Mel-mm-patch-2.txt" of type "text/plain" (490329 bytes)