Message-ID: <4A0B0C88.6010306@kernel.org>
Date: Wed, 13 May 2009 11:08:08 -0700
From: Yinghai Lu <yinghai@...nel.org>
To: Jack Steiner <steiner@....com>, Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mel@....ul.ie>
CC: "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
David Rientjes <rientjes@...gle.com>,
Andi Kleen <andi@...stfloor.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Rusty Russell <rusty@...tcorp.com.au>,
Mike Travis <travis@....com>
Subject: Re: [PATCH] x86: fix system without memory on node0
Jack Steiner wrote:
> On Tue, May 12, 2009 at 06:34:31PM -0700, Yinghai Lu wrote:
>> Jack found a crash on a system that doesn't have memory on node0.
>>
>> it turns out with per_cpu changeset, node_number for BSP will be always 0,
>> and it is consistent to cpu_to_node() that is to near node already.
>> aka when numa_set_node() for node0 is called early before per_cpu area is
>> setup
>>
>> try to set the node_number for boot cpu, after we get per_cpu area setup.
>>
>> [ Impact: fix crashing on memoryless node 0]
>>
>> Reported-by: Jack Steiner <steiner@....com>
>> Signed-off-by: Yinghai Lu <yinghai@...nel.org>
>>
>> ---
>> arch/x86/kernel/setup_percpu.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> Index: linux-2.6/arch/x86/kernel/setup_percpu.c
>> ===================================================================
>> --- linux-2.6.orig/arch/x86/kernel/setup_percpu.c
>> +++ linux-2.6/arch/x86/kernel/setup_percpu.c
>> @@ -423,6 +423,14 @@ void __init setup_per_cpu_areas(void)
>> early_per_cpu_ptr(x86_cpu_to_node_map) = NULL;
>> #endif
>>
>> +#if defined(CONFIG_X86_64) && defined(CONFIG_NUMA)
>> + /*
>> + * make sure boot cpu node_number is right, when boot cpu is on the
>> + * node that doesn't have mem installed
>> + */
>> + per_cpu(node_number, boot_cpu_id) = cpu_to_node(boot_cpu_id);
>> +#endif
>> +
>> /* Setup node to cpumask map */
>> setup_node_to_cpumask_map();
>>
>
> With the patch above PLUS the patch below, I verified that all of our strange
> configurations boot to shell prompt & run simple commands. There are certainly
> some corner cases that have not been tested.
>
> Note that both patches are required. The system panics in early boot if either
> patch is omitted.
>
> ---
>
>
> Ignore offline nodes when building the zone lists. This
> fix is needed to support configurations that have PXMs with
> cpus but no memory.
>
>
> Signed-off-by: Jack Steiner <steiner@....com>
>
>
> ---
> mm/page_alloc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> Index: linux/mm/page_alloc.c
> ===================================================================
> --- linux.orig/mm/page_alloc.c 2009-05-12 17:06:59.000000000 -0500
> +++ linux/mm/page_alloc.c 2009-05-13 09:54:09.000000000 -0500
> @@ -2370,6 +2370,8 @@ static void build_zonelists(pg_data_t *p
> * If another node is sufficiently far away then it is better
> * to reclaim pages in a zone before going off node.
> */
> + if (!node_online(node))
> + continue;
> if (distance > RECLAIM_DISTANCE)
> zone_reclaim_mode = 1;
>
Can you try this instead of your patch?
[PATCH] mm: clear N_HIGH_MEMORY map before we set it again
In case some system has a strange SRAT table with some kind of small range.
Signed-off-by: Yinghai Lu <Yinghai@...nel.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -4041,6 +4041,11 @@ void __init free_area_init_nodes(unsigne
early_node_map[i].start_pfn,
early_node_map[i].end_pfn);
+ /*
+ * find_zone_movable_pfns_for_nodes/early_calculate_totalpages init
+ * that node_mask, clear it at first
+ */
+ nodes_clear(nodes_state[N_HIGH_MEMORY]);
/* Initialise every node */
mminit_verify_pageflags_layout();
setup_nr_node_ids();
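
To illustrate why clearing the mask first can matter, here is a minimal userspace sketch of the idea, not kernel code. Everything prefixed with model_ is hypothetical, and the page counts are made-up assumptions: an early pass sees a tiny range on node 0 and marks it as having memory, while the final pass finds no usable pages there. Without the clear, the stale bit from the early pass survives.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's nodemask_t: one bit per node (hypothetical). */
struct model_nodemask {
	unsigned long long bits;
};

static void model_node_set(struct model_nodemask *m, int node)
{
	m->bits |= 1ULL << node;
}

static bool model_node_isset(const struct model_nodemask *m, int node)
{
	return (m->bits >> node) & 1ULL;
}

/* Plays the role of nodes_clear() on the N_HIGH_MEMORY mask. */
static void model_nodes_clear(struct model_nodemask *m)
{
	m->bits = 0;
}

/* Set a bit for every node that reports at least one page. */
static void model_mark_nodes_with_memory(struct model_nodemask *m,
					 const unsigned long *pages,
					 int nr_nodes)
{
	for (int n = 0; n < nr_nodes; n++)
		if (pages[n] > 0)
			model_node_set(m, n);
}

/*
 * Run an early pass and a final pass over the same mask and report
 * whether node 0 still looks like it has memory afterwards.
 */
static bool model_node0_has_memory(bool clear_first)
{
	/* Assumed data: a tiny range makes node 0 look populated early on. */
	const unsigned long early_pages[2] = { 1, 4096 };
	/* Assumed data: the final pass finds no usable pages on node 0. */
	const unsigned long final_pages[2] = { 0, 4096 };
	struct model_nodemask mask = { 0 };

	model_mark_nodes_with_memory(&mask, early_pages, 2);
	if (clear_first)
		model_nodes_clear(&mask);
	model_mark_nodes_with_memory(&mask, final_pages, 2);

	return model_node_isset(&mask, 0);
}

/* Small self-check of both orderings. */
static void model_self_check(void)
{
	assert(model_node0_has_memory(false));	/* stale bit survives */
	assert(!model_node0_has_memory(true));	/* cleared before re-set */
}
```

In this model, model_node0_has_memory(false) reports node 0 as populated because of the stale bit, and model_node0_has_memory(true) does not. Whether the real early pass misreports node 0 in this way depends on the SRAT table in question.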