Message-ID: <20140114014516.GC4327@dhcp-16-126.nay.redhat.com>
Date: Tue, 14 Jan 2014 09:45:16 +0800
From: Dave Young <dyoung@...hat.com>
To: Prarit Bhargava <prarit@...hat.com>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Len Brown <lenb@...nel.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Linn Crosetto <linn@...com>, Pekka Enberg <penberg@...nel.org>,
Yinghai Lu <yinghai@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Toshi Kani <toshi.kani@...com>,
Tang Chen <tangchen@...fujitsu.com>,
Wen Congyang <wency@...fujitsu.com>,
Vivek Goyal <vgoyal@...hat.com>, kosaki.motohiro@...il.com,
linux-acpi@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] x86, acpi memory hotplug, add parameter to disable
memory hotplug
On 01/13/14 at 04:56pm, Prarit Bhargava wrote:
> When booting a kexec/kdump kernel on a system that has specific memory hotplug
> regions, the boot fails with warnings like:
>
> [ 2.939467] swapper/0: page allocation failure: order:9, mode:0x84d0
> [ 2.946564] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-65.el7.x86_64 #1
> [ 2.954532] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
> [ 2.964926] 0000000000000000 ffff8800341bd8c8 ffffffff815bcc67 ffff8800341bd950
> [ 2.973224] ffffffff8113b1a0 ffff880036339b00 0000000000000009 00000000000084d0
> [ 2.981523] ffff8800341bd950 ffffffff815b87ee 0000000000000000 0000000000000200
> [ 2.989821] Call Trace:
> [ 2.992560] [<ffffffff815bcc67>] dump_stack+0x19/0x1b
> [ 2.998300] [<ffffffff8113b1a0>] warn_alloc_failed+0xf0/0x160
> [ 3.004817] [<ffffffff815b87ee>] ? __alloc_pages_direct_compact+0xac/0x196
> [ 3.012594] [<ffffffff8113f14f>] __alloc_pages_nodemask+0x7ff/0xa00
> [ 3.019692] [<ffffffff815b417c>] vmemmap_alloc_block+0x62/0xba
> [ 3.026303] [<ffffffff815b41e9>] vmemmap_alloc_block_buf+0x15/0x3b
> [ 3.033302] [<ffffffff815b1ff6>] vmemmap_populate+0xb4/0x21b
> [ 3.039718] [<ffffffff815b461d>] sparse_mem_map_populate+0x27/0x35
> [ 3.046717] [<ffffffff815b400f>] sparse_add_one_section+0x7a/0x185
> [ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
> [ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
> [ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
> [ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
> [ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
> [ 3.131987] [<ffffffff81a1fd58>] ? acpi_sleep_proc_init+0x2a/0x2a
> [ 3.138889] [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
> [ 3.145210] [<ffffffff819e20c4>] kernel_init_freeable+0x17c/0x207
> [ 3.152111] [<ffffffff819e18d0>] ? do_early_param+0x88/0x88
> [ 3.158430] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
> [ 3.164264] [<ffffffff8159feae>] kernel_init+0xe/0x180
> [ 3.170097] [<ffffffff815cca2c>] ret_from_fork+0x7c/0xb0
> [ 3.176123] [<ffffffff8159fea0>] ? rest_init+0x80/0x80
> [ 3.181956] Mem-Info:
> [ 3.184490] Node 0 DMA per-cpu:
> [ 3.188007] CPU 0: hi: 0, btch: 1 usd: 0
> [ 3.193353] Node 0 DMA32 per-cpu:
> [ 3.197060] CPU 0: hi: 42, btch: 7 usd: 0
> [ 3.202410] active_anon:0 inactive_anon:0 isolated_anon:0
> [ 3.202410] active_file:0 inactive_file:0 isolated_file:0
> [ 3.202410] unevictable:0 dirty:0 writeback:0 unstable:0
> [ 3.202410] free:872 slab_reclaimable:13 slab_unreclaimable:1880
> [ 3.202410] mapped:0 shmem:0 pagetables:0 bounce:0
> [ 3.202410] free_cma:0
>
> because the system has run out of memory at boot time. This occurs
> because of the following sequence in the boot:
>
> The main kernel boots and sets the E820 map. The second kernel is booted with a
> map generated by the kdump service using memmap= and memmap=exactmap.
> These parameters are added to the kernel parameters of the kexec/kdump
> kernel. The kexec/kdump kernel has limited memory resources so as not
> to severely impact the main kernel.
>
> The system then panics and the kdump/kexec kernel boots (which is a
> completely new kernel boot). During this boot ACPI is initialized and the
> kernel (as can be seen above) traverses the ACPI namespace and finds an
> entry for a memory device to be hotadded.
>
> i.e.:
>
> [ 3.053720] [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
> [ 3.059656] [<ffffffff81047359>] arch_add_memory+0x59/0xd0
> [ 3.065877] [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
> [ 3.071713] [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
> [ 3.078813] [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
> [ 3.085719] [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
> [ 3.092716] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.100004] [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
> [ 3.107293] [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
> [ 3.113904] [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
> [ 3.119933] [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
> [ 3.126153] [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
>
> At this point the kernel adds page table information and the kexec/kdump
> kernel runs out of memory.
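
For scale, the failing request in the trace above is order:9. Below is a
rough sketch of the arithmetic, assuming 4 KiB base pages (the usual x86_64
base page size). The macro and helper names are made up for illustration and
are not kernel APIs.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages, as is usual on x86_64. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Bytes covered by a buddy allocation of the given order (2^order pages). */
uint64_t order_to_bytes(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/* Bytes represented by a page count such as the "free:" field in Mem-Info. */
uint64_t pages_to_bytes(uint64_t pages)
{
	return pages * SKETCH_PAGE_SIZE;
}
```

Under that assumption order_to_bytes(9) is 2 MiB of physically contiguous
memory, while pages_to_bytes(872), the free: count in the Mem-Info dump, is
about 3.4 MiB in total. That leaves very little room for a contiguous block
of the requested size.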
>
> This can also be reproduced with a "regular" kernel by using the
> memmap=exactmap and mem=X parameters on the main kernel and booting.
>
> This patchset resolves the problem by adding a kernel parameter,
> acpi_no_memhotplug, to disable ACPI memory hotplug. ACPI memory hotplug
> should also be disabled by default when a user specifies a memory mapping with
> "memmap=exactmap" or "mem=X".
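
To make the intended behaviour concrete, here is a small user-space sketch of
which command-line options would end up disabling ACPI memory hotplug under
this patch. It is only a model: the function names are hypothetical, it does
not use the kernel's early_param()/__setup() machinery, and it ignores
details of the real parsers (for example the "mem=nopentium" special case and
an invalid "mem=" size, both of which return before the flag is set).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: does this single token disable ACPI memory hotplug? */
bool token_disables_memhotplug(const char *tok)
{
	return strcmp(tok, "acpi_no_memhotplug") == 0 ||
	       strcmp(tok, "memmap=exactmap") == 0 ||
	       strncmp(tok, "mem=", 4) == 0;
}

/* Hypothetical helper: scan a space-separated command line token by token. */
bool cmdline_disables_memhotplug(const char *cmdline)
{
	char tok[128];
	const char *p = cmdline;

	while (*p) {
		size_t len;

		while (*p == ' ')
			p++;
		len = strcspn(p, " ");
		if (len == 0)
			break;
		/* Tokens too long for the buffer cannot match and are skipped. */
		if (len < sizeof(tok)) {
			memcpy(tok, p, len);
			tok[len] = '\0';
			if (token_disables_memhotplug(tok))
				return true;
		}
		p += len;
	}
	return false;
}
```

In this model "memmap=exactmap memmap=640K@0" and "root=/dev/sda1 mem=4G"
both disable memory hotplug, while a plain "memmap=64M$1G" reservation does
not, which matches where the patch places its set_acpi_no_memhotplug() calls.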
>
> Signed-off-by: Prarit Bhargava <prarit@...hat.com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Cc: x86@...nel.org
> Cc: Len Brown <lenb@...nel.org>
> Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>
> Cc: Linn Crosetto <linn@...com>
> Cc: Pekka Enberg <penberg@...nel.org>
> Cc: Yinghai Lu <yinghai@...nel.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Toshi Kani <toshi.kani@...com>
> Cc: Tang Chen <tangchen@...fujitsu.com>
> Cc: Wen Congyang <wency@...fujitsu.com>
> Cc: Vivek Goyal <vgoyal@...hat.com>
> Cc: kosaki.motohiro@...il.com
> Cc: dyoung@...hat.com
> Cc: linux-acpi@...r.kernel.org
> Cc: linux-mm@...ck.org
> ---
> Documentation/kernel-parameters.txt | 3 +++
> arch/x86/kernel/e820.c | 4 ++++
> drivers/acpi/acpi_memhotplug.c | 18 ++++++++++++++++++
> include/linux/memory_hotplug.h | 9 +++++++++
> 4 files changed, 34 insertions(+)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index b9e9bd8..ea93f75 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -343,6 +343,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> no: ACPI OperationRegions are not marked as reserved,
> no further checks are performed.
>
> + acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kexec
> + and kdump kernels.
> +
> add_efi_memmap [EFI; X86] Include EFI memory map in
> kernel's map of available physical RAM.
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 174da5f..3c431fe 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -20,6 +20,7 @@
> #include <linux/firmware-map.h>
> #include <linux/memblock.h>
> #include <linux/sort.h>
> +#include <linux/memory_hotplug.h>
>
> #include <asm/e820.h>
> #include <asm/proto.h>
> @@ -834,6 +835,8 @@ static int __init parse_memopt(char *p)
> return -EINVAL;
> e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
>
> + set_acpi_no_memhotplug();
> +
> return 0;
> }
> early_param("mem", parse_memopt);
> @@ -857,6 +860,7 @@ static int __init parse_memmap_one(char *p)
> #endif
> e820.nr_map = 0;
> userdef = 1;
> + set_acpi_no_memhotplug();
> return 0;
> }
>
> diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> index 551dad7..d104a7d 100644
> --- a/drivers/acpi/acpi_memhotplug.c
> +++ b/drivers/acpi/acpi_memhotplug.c
> @@ -361,7 +361,25 @@ static void acpi_memory_device_remove(struct acpi_device *device)
> acpi_memory_device_free(mem_device);
> }
>
> +static bool acpi_no_memhotplug;
> +
> +void set_acpi_no_memhotplug(void)
> +{
> + acpi_no_memhotplug = true;
> + pr_info_once("ACPI: Memory Hotplug Disabled\n");
> +}
> +
> void __init acpi_memory_hotplug_init(void)
> {
> + if (acpi_no_memhotplug)
> + return;
> +
> acpi_scan_add_handler_with_hotplug(&memory_device_handler, "memory");
> }
> +
> +static int __init disable_acpi_memory_hotplug(char *str)
> +{
> + set_acpi_no_memhotplug();
> + return 1;
> +}
> +__setup("acpi_no_memhotplug", disable_acpi_memory_hotplug);
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 4ca3d95..3cdb6e0 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -12,6 +12,15 @@ struct pglist_data;
> struct mem_section;
> struct memory_block;
>
> +#ifdef CONFIG_ACPI_HOTPLUG_MEMORY
> +/* set flag to disable ACPI memory hotplug */
> +extern void set_acpi_no_memhotplug(void);
> +#else
> +static inline void set_acpi_no_memhotplug(void)
> +{
> +}
> +#endif
> +
> #ifdef CONFIG_MEMORY_HOTPLUG
>
> /*
> --
> 1.7.9.3
>
Acked-by: Dave Young <dyoung@...hat.com>