Message-ID: <20091001085628.GD15345@elte.hu>
Date: Thu, 1 Oct 2009 10:56:28 +0200
From: Ingo Molnar <mingo@...e.hu>
To: David Rientjes <rientjes@...gle.com>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>, Yinghai Lu <yinghai@...nel.org>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Ankita Garg <ankita@...ibm.com>,
Len Brown <len.brown@...el.com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch 4/4] x86: interleave emulated nodes over physical nodes
* David Rientjes <rientjes@...gle.com> wrote:
> Add interleaved NUMA emulation support
>
> This patch interleaves emulated nodes over the system's physical
> nodes. This is required for interleave optimizations since
> mempolicies, for example, operate by iterating over a nodemask and act
> without knowledge of node distances. It can also be used for testing
> memory latencies and NUMA bugs in the kernel.
>
> There are a couple of ways to do this:
>
> - divide the number of emulated nodes by the number of physical nodes
> and allocate the result on each physical node, or
>
> - allocate each successive emulated node on a different physical node
> until all memory is exhausted.
>
> The disadvantage of the first option is that, depending on the
> asymmetry in the capacities of the physical nodes, emulated nodes may
> differ substantially in size from one physical node to another.
>
> The disadvantage of the second option is that, also depending on the
> asymmetry in the capacities of the physical nodes, more emulated nodes
> may be allocated on one physical node than on another.
>
> This patch implements the second option: we accept that slightly more
> emulated nodes may end up on one physical node than on another, in
> exchange for avoiding asymmetry in emulated node sizes.
>
> [ Note that "node capacity" of a physical node is not only a function of
> its addressable range, but also is affected by subtracting out the
> amount of reserved memory over that range. NUMA emulation only deals
> with available, non-reserved memory quantities. ]
>
> We ensure there is at least a minimal amount of available memory
> allocated to each node. We also make sure that at least this amount of
> available memory is available in ZONE_DMA32 for any node that includes
> both ZONE_DMA32 and ZONE_NORMAL.
>
> This patch also cleans the emulation code up by no longer passing the
> statically allocated struct bootnode array among the various functions.
> Since this array may be very large, it lives in init.data at file scope
> rather than being allocated on the stack.
>
> The WARN_ON() for nodes_cover_memory() when faking proximity domains is
> removed since it relies on successive nodes always having greater start
> addresses than previous nodes; with interleaving this is no longer always
> true.
>
> Cc: Yinghai Lu <yinghai@...nel.org>
> Cc: Balbir Singh <balbir@...ux.vnet.ibm.com>
> Cc: Ankita Garg <ankita@...ibm.com>
> Signed-off-by: David Rientjes <rientjes@...gle.com>
> ---
> arch/x86/mm/numa_64.c | 211 ++++++++++++++++++++++++++++++++++++++++++------
> arch/x86/mm/srat_64.c | 1 -
> 2 files changed, 184 insertions(+), 28 deletions(-)
Looks very nice. Peter, Thomas, any objections against queueing this up
in the x86 tree for more testing?
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/