Message-ID: <4A0894A5.9000209@zytor.com>
Date: Mon, 11 May 2009 14:12:05 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: David Rientjes <rientjes@...gle.com>
CC: Jack Steiner <steiner@....com>, Yinghai Lu <yinghai@...nel.org>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Andi Kleen <andi@...stfloor.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/3] x86: fix node_possible_map logic -v2
David Rientjes wrote:
>
> In your example of two cpus (0-1) that are remote to the system's only
> memory and two cpus (2-3) that have affinity to that memory, it appears as
> though the kernel is considering cpus 2-3 and the memory to be a node and
> cpus 0-1 to be a memoryless node.
>
> That's a pretty useless scenario for memoryless node support, actually,
> unless there's a third node with memory that cpus 0-1 have a different
> distance to. cpus 0-1 have no memory that is local, so the "remote"
> memory should be considered local to them.
>
Should it? It seems to me that CPUs 0-1 should be antipreferentially
scheduled, since they will have slower access to the memory than CPUs
2-3. Since in this case all the memory is in the same place you could
argue that SMP distances could do the same job, which is of course true.
However, consider now:

	CPU [0-1] - no memory
	CPU [2-3] - memory
	CPU [4-5] - memory

All three nodes are equidistant from one another, but for the memory
nodes there is a difference between their own local memory and the
remote memory.

CPUs 0-1 cannot be considered local to either memory node, since they
are further away from the memory than either node's own CPUs, and
furthermore, unlike either of the memory nodes, they have no preference
for memory from either of the other two nodes (quite the contrary; they
would probably benefit from drawing from both.)
> I don't know who has been pushing the memoryless node support, but it
> appears as though it hasn't been fully tested yet. The NULL
> pglist_data here for node 0 seems appropriate since you don't need it
> unless you're describing memory, but the kernel implies that if a bit
> is set in node_online_map or node_possible_map that it has this
> associated data.
No doubt there are still bugs.
-hpa