Message-Id: <20070208002809.c75b2742.kamezawa.hiroyu@jp.fujitsu.com>
Date: Thu, 8 Feb 2007 00:28:09 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Christoph Lameter <clameter@....com>
Cc: ak@...e.de, linux-kernel@...r.kernel.org, y-goto@...fujitsu.com,
clameter@...r.sgi.com, akpm@...l.org
Subject: Re: [2.6.20][PATCH] fix mempolicy error check on a system with
memory-less-node
On Wed, 7 Feb 2007 06:05:56 -0800 (PST)
Christoph Lameter <clameter@....com> wrote:
> On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote:
>
> > > IMHO there shouldn't be any memory less nodes. The architecture code
> > > should not create them. The CPU should be assigned to a nearby node instead.
> > > At least x86-64 ensures that.
> > >
> > AFAIK, ia64 creates nodes just depends on SRAT's possible resource information.
> > Then, ia64 can create cpu-memory-less-node(node with no available resource.).
> > (*)I don't like this.
>
> I think that is only true for !SN2 platforms? Could we fix this?
>
AFAIK, some vendor (HP?) has the following configuration:
- node0 .... cpu only node
- node1 .... cpu only node
- node2 .... memory only node.
This is because of their memory-interleave technique.
Our 64cpu socket NUMA system also has a config of
- node0 .... cpu+memory node
- node1-7 .... cpu only nodes
for dividing the scheduler domain. (Old kernels had a problem with a big sched-domain.)
To fix memory-less-node, we have to test the performance of a
"very-big-scheduler-domain" and define a rule for cpu-hot-add, such as
"a new cpu will be added to the nearest node".
(node-hot-add will have to add some hook...)
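For illustration only (this is not from any posted patch): the "nearest node" rule could be sketched in user-space C roughly as below. struct topo, nearest_memory_node(), MAX_NODES and the distance table are hypothetical names made up for this sketch; real code would have to use the kernel's own node distance information and hotplug hooks.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 8     /* hypothetical limit for this sketch */
#define NO_NODE   (-1)  /* returned when no node has memory */

/* Hypothetical description of a small NUMA topology. */
struct topo {
    int  nr_nodes;
    bool has_memory[MAX_NODES];
    int  distance[MAX_NODES][MAX_NODES];  /* SLIT-like distance table */
};

/*
 * Return the node with memory that is closest to 'from'.
 * If 'from' has memory itself, 'from' is returned.
 * On equal distance the lowest-numbered node wins.
 * Returns NO_NODE when no node has memory at all.
 */
static int nearest_memory_node(const struct topo *t, int from)
{
    int best = NO_NODE;
    int best_dist = INT_MAX;

    assert(t->nr_nodes > 0 && t->nr_nodes <= MAX_NODES);
    assert(from >= 0 && from < t->nr_nodes);

    if (t->has_memory[from])
        return from;

    for (int n = 0; n < t->nr_nodes; n++) {
        if (!t->has_memory[n])
            continue;
        if (t->distance[from][n] < best_dist) {
            best_dist = t->distance[from][n];
            best = n;
        }
    }
    return best;
}
```

With the "node0, node1 cpu only / node2 memory only" layout mentioned above, a cpu hot-added on node0 would be assigned to node2 under this rule.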
I don't know whether others who created memory-less-nodes in the past may have other issues.
There may be systems with a complicated topology and a complicated PXM map.
> > If we don't allow memory-less-node, we may have to add several codes for cpu-hot-add.
> > cpus should be moved to nearby node at hotadd .
> > And node-hot-add have to care that cpus mustn't be added before memory, cpu-driven
> > node-hot-add will never occur. (ACPI's 'container' device spec can't guarantee this.)
>
> Well you could bring down the cpu and bring it up again? This would also
> assure the best placement of the runtime structures for node?
>
The cpu-to-node relationship is fixed in the early stage of cpu hotplug.
I'm not sure we can bring a cpu down and up again in a clean way. Currently, after a cpu
is added, the kernel loses its original PXM value.
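As a rough sketch of what keeping that value could look like (an assumption for illustration, not existing kernel code): a per-cpu table that remembers the PXM seen at hot-add time, so a later down/up cycle could look it up again. cpu_pxm[], cpu_pxm_record(), cpu_pxm_lookup(), MAX_CPUS and PXM_UNKNOWN are all hypothetical names.

```c
#include <assert.h>

#define MAX_CPUS    64    /* hypothetical limit for this sketch */
#define PXM_UNKNOWN (-1)  /* no PXM recorded for this cpu */

/* Hypothetical table: PXM value seen when each cpu was hot-added. */
static int cpu_pxm[MAX_CPUS];

/* Mark every cpu as having no recorded PXM. */
static void cpu_pxm_init(void)
{
    for (int cpu = 0; cpu < MAX_CPUS; cpu++)
        cpu_pxm[cpu] = PXM_UNKNOWN;
}

/* Remember the PXM reported by firmware at hot-add time. */
static void cpu_pxm_record(int cpu, int pxm)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    cpu_pxm[cpu] = pxm;
}

/* Return the remembered PXM, or PXM_UNKNOWN if none was recorded. */
static int cpu_pxm_lookup(int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    return cpu_pxm[cpu];
}
```

Whether bringing the cpu down and up with such a table is clean enough is exactly the part I'm not sure about.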
About runtime structures:
The runtime structure placement for a hot-added node is another issue here.
Goto-san and I have a plan for optimized placement of structures and will
try it when we can. (We are now assigned to RHEL5 stabilization tasks...)
Moving per-cpu-area at hotadd does not look easy.
IMHO, maybe we have to use stop_machine_run() to move it.
Anyway, I'll post another *easy* patch just to fix the NULL pointer access.
Please review.
Thanks,
-Kame