Message-ID: <alpine.DEB.1.00.0909211704180.4798@chino.kir.corp.google.com>
Date: Mon, 21 Sep 2009 17:19:13 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Benjamin Herrenschmidt <benh@...nel.crashing.org>
cc: Mel Gorman <mel@....ul.ie>, Nick Piggin <npiggin@...e.de>,
Pekka Enberg <penberg@...helsinki.fi>,
Christoph Lameter <cl@...ux-foundation.org>,
heiko.carstens@...ibm.com, sachinp@...ibm.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Tejun Heo <tj@...nel.org>,
Lee Schermerhorn <Lee.Schermerhorn@...com>
Subject: Re: [RFC PATCH 0/3] Fix SLQB on memoryless configurations V2
On Tue, 22 Sep 2009, Benjamin Herrenschmidt wrote:
> So if I understand correctly, we have a problem with both cpu-less and
> memory-less nodes. Interesting setups :-)
>
I agree with Christoph that we need to resolve the larger issue of
memoryless nodes in the kernel, and the result of that work will most
likely become the basis for the slqb fixes.

I disagree that we need kernel support for memoryless nodes on x86, and
probably on all architectures, period. "NUMA nodes" will always contain
memory by definition, and I think hijacking the node abstraction to
represent anything other than memory affinity is wrong in the interest of
a long-term maintainable kernel and will continue to cause issues such as
this one in other subsystems.
I do understand the asymmetries of these machines, including the ppc that
is triggering this particular hang with slqb. But I believe the support
can be implemented in a different way: I would offer an alternative
representation based entirely on node distances. This would isolate each
region of memory that has varying affinity to CPUs, PCI buses, etc., into
nodes and then report a distance, whether local or remote, to other nodes
much in the way the ACPI specification does with proximity domains.

Node distances, rather than memoryless nodes, could still represent all
asymmetric machines that currently benefit from that support: devices
would be bound to the memory regions to which they have the closest
affinity, and relative distances to other nodes would be reported via
node_distance().
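
To make the distance-based representation concrete, here is a minimal
userspace sketch in C. It is not kernel code and not a proposed patch: the
three-node table, the node count, and every sketch_* name are hypothetical
and chosen only for illustration. The one value taken from elsewhere is 10
as the distance from a node to itself, which is the convention the ACPI
distance table uses. Every node in the table owns memory, and a device or
cpu bound to a node finds its next-best memory by comparing distances:

```c
#include <assert.h>

/* Hypothetical machine: three nodes, each of which contains memory. */
#define SKETCH_NR_NODES        3
#define SKETCH_LOCAL_DISTANCE  10   /* ACPI convention: node to itself */

/* Illustrative distances only; larger means further away. */
static const int sketch_distance[SKETCH_NR_NODES][SKETCH_NR_NODES] = {
	/*         n0  n1  n2 */
	/* n0 */ { 10, 20, 40 },
	/* n1 */ { 20, 10, 20 },
	/* n2 */ { 40, 20, 10 },
};

/* Stand-in for node_distance(); a plain table lookup in this sketch. */
static int sketch_node_distance(int from, int to)
{
	assert(from >= 0 && from < SKETCH_NR_NODES);
	assert(to >= 0 && to < SKETCH_NR_NODES);
	return sketch_distance[from][to];
}

/*
 * Return the closest node other than 'from', e.g. as a fallback when an
 * allocation on 'from' cannot be satisfied. Ties go to the lowest node id.
 */
static int sketch_nearest_remote_node(int from)
{
	int best = -1;

	for (int n = 0; n < SKETCH_NR_NODES; n++) {
		if (n == from)
			continue;
		if (best < 0 ||
		    sketch_node_distance(from, n) < sketch_node_distance(from, best))
			best = n;
	}
	return best;
}
```

With this table, something bound to node 2 falls back to node 1 (distance
20) before node 0 (distance 40), and no node in the table has to be
memoryless to express that asymmetry.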