Message-ID: <alpine.DEB.1.00.0909220132250.19097@chino.kir.corp.google.com>
Date: Tue, 22 Sep 2009 01:44:58 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Benjamin Herrenschmidt <benh@...nel.crashing.org>
cc: Christoph Lameter <cl@...ux-foundation.org>,
Mel Gorman <mel@....ul.ie>, Nick Piggin <npiggin@...e.de>,
Pekka Enberg <penberg@...helsinki.fi>,
heiko.carstens@...ibm.com, sachinp@...ibm.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Tejun Heo <tj@...nel.org>,
Lee Schermerhorn <Lee.Schermerhorn@...com>
Subject: Re: [RFC PATCH 0/3] Fix SLQB on memoryless configurations V2
On Tue, 22 Sep 2009, Benjamin Herrenschmidt wrote:
> While I like the idea of NUMA nodes being strictly memory and everything
> else being expressed by distances, we'll have to clean up quite a few
> corners with skeletons in various states of decomposition waiting for
> us there.
>
Agreed, it's invasive.
> For example, we have code here or there that (ab)uses the NUMA node
> information to link devices with their iommu, that sort of thing. IE, a
> hard dependency which isn't really related to a concept of distance to
> any memory.
>
ACPI's SLIT uses a distance of 0xff to specify that one locality is
unreachable from another. We could easily adopt that convention.
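
To illustrate the convention, here is a rough userspace sketch, not kernel
code. The table contents, NR_LOCALITIES, and locality_reachable() are all
made up for the example; only the values 10 (local) and 0xff (unreachable)
come from the SLIT definition. Locality 2 stands in for a device that is
tied to locality 1 and has no path to locality 0.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIST_LOCAL        10    /* SLIT: distance of a locality to itself */
#define DIST_UNREACHABLE  0xff  /* SLIT: no path between the two localities */
#define NR_LOCALITIES     3     /* hypothetical: two nodes plus one device */

/* Hypothetical distance table; rows are "from", columns are "to". */
static const uint8_t distance[NR_LOCALITIES][NR_LOCALITIES] = {
    { DIST_LOCAL,       20,         DIST_UNREACHABLE },
    { 20,               DIST_LOCAL, 20               },
    { DIST_UNREACHABLE, 20,         DIST_LOCAL       },
};

/* A hard dependency becomes "every other locality is unreachable". */
static bool locality_reachable(int from, int to)
{
    return distance[from][to] != DIST_UNREACHABLE;
}
```

With a table like this, the device's dependency on locality 1 is expressed
purely through distances, without overloading a node id.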
> At least on powerpc, nowadays, I can pretty much make everything
> fall back to some representation in the device-tree though, thus it
> shouldn't be -that- hard to fix I suppose.
>
Cool, that's encouraging.

I really think that this type of abstraction would make things simpler in
the long term. For example, I just finished fixing a bug in tip where
cpumask_of_pcibus() wasn't returning cpu_all_mask for busses without any
affinity on x86. This was a consequence of cpumask_of_pcibus() being
forced to rely on pcibus_to_node() since there is no other abstraction
available. For busses without affinity to any specific cpus, the
implementation had relied on returning the mapping's default node of -1 to
represent all cpus. That type of complexity could easily be avoided if
the bus were isolated into its own locality and the mapping to all cpu
localities was of local distance.
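
As a rough userspace sketch of that idea (not the kernel implementation,
and every name below is hypothetical): localities 0 and 1 hold cpus,
locality 2 is a bus with no particular affinity, and the bus's cpu mask is
simply every cpu whose locality is at local distance from it. No special
-1 node is needed to mean "all cpus".

```c
#include <stdint.h>

#define LOCAL_DISTANCE  10  /* SLIT value for local distance */
#define NR_LOC          3   /* hypothetical: 0 and 1 hold cpus, 2 is a bus */
#define NR_CPUS_SKETCH  4   /* hypothetical cpu count */

/* Hypothetical table: the bus (locality 2) is local to both cpu localities. */
static const uint8_t loc_distance[NR_LOC][NR_LOC] = {
    { 10, 20, 10 },
    { 20, 10, 10 },
    { 10, 10, 10 },
};

/* Hypothetical cpu-to-locality mapping: cpus 0-1 in 0, cpus 2-3 in 1. */
static const int cpu_to_loc[NR_CPUS_SKETCH] = { 0, 0, 1, 1 };

/* Return a bitmask of cpus whose locality is at local distance from loc. */
static uint64_t sketch_cpumask_of_locality(int loc)
{
    uint64_t mask = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        if (loc_distance[loc][cpu_to_loc[cpu]] == LOCAL_DISTANCE)
            mask |= UINT64_C(1) << cpu;
    return mask;
}
```

Here the bus locality yields a mask covering all four cpus, while each cpu
locality yields only its own cpus, using the same lookup in both cases.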