Message-ID: <alpine.DEB.2.00.1001062353570.15070@chino.kir.corp.google.com>
Date: Thu, 7 Jan 2010 00:04:47 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Rusty Russell <rusty@...tcorp.com.au>,
Ingo Molnar <mingo@...hat.com>
cc: Anton Blanchard <anton@...ba.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch 6/6] x86: cpumask_of_node() should handle -1 as a node
On Thu, 7 Jan 2010, Rusty Russell wrote:
> On Thu, 7 Jan 2010 10:21:06 am David Rientjes wrote:
> > On Thu, 7 Jan 2010, Anton Blanchard wrote:
> >
> > > I don't like the use of -1 as a node, but it's much more widespread than
> > > x86; including sh, powerpc, sparc and the generic topology code. eg:
> > >
> > >
> > > #ifdef CONFIG_PCI
> > > extern int pcibus_to_node(struct pci_bus *pbus);
> > > #else
> > > static inline int pcibus_to_node(struct pci_bus *pbus)
> > > {
> > > 	return -1;
> > > }
> > > #endif
> >
> > This seems to be the same semantics that NUMA_NO_NODE was defined for,
> > it's not necessarily a special case.
>
> It's widespread, and we've just had another bug due to pcibus_to_node handling
> -1 and cpumask_of_node not. (Search lkml for subject "[Regression] 2.6.33-rc2
> - pci: Commit e0cd516 causes OOPS").
>
That's similar to the problem in cpumask_of_pcibus() that I fixed with
7715a1e back in September. The difference is that I isolated my fix to
the pci bus implementation, which defines a nid of -1 to mean no NUMA
affinity, whereas generic kernel code can use that value for any (or no)
definition, so returning cpu_all_mask may not apply. We know it does for
pcibus, but not for generic NUMA node ids that happen to be invalid.
The hope is that eventually we can remove many dependencies on node ids
for these purposes; buses with no affinity are not actually members of any
NUMA node. I had a proposal for a generic kernel interface, based on
ACPI system localities, that would define the proximity of any system
entity (of which a node is only the type defining "memory") to any other.
I'm waiting for enough time to work on that project.
NUMA is special because a single cpu is always a member of a single node,
so we're violating the bidirectional mapping by saying that node -1 maps
to all cpus while all cpus don't map to node -1. In other words, I think
we should take this as an opportunity to find and fix broken callers as
we've done both by my patch from September and by your aforementioned
case. In this particular case, it would be a matter of doing:
mask = (nid != -1) ? cpumask_of_node(nid) : cpu_all_mask;
I'm hoping that Ingo will weigh in on this topic with his taste and vision
for how we should decouple device locality information from entities that
are not members of any NUMA node if we continue to "special case" these
things.