Message-ID: <20150408230740.GB53918@linux.vnet.ibm.com>
Date: Wed, 8 Apr 2015 16:07:40 -0700
From: Nishanth Aravamudan <nacc@...ux.vnet.ibm.com>
To: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
Cc: Grant Likely <grant.likely@...aro.org>, devicetree@...r.kernel.org,
Rob Herring <robh+dt@...nel.org>, linux-kernel@...r.kernel.org,
sparclinux@...r.kernel.org, linux-mm@...ck.org,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH] of: return NUMA_NO_NODE from fallback of_node_to_nid()
On 08.04.2015 [20:04:04 +0300], Konstantin Khlebnikov wrote:
> On 08.04.2015 19:59, Konstantin Khlebnikov wrote:
> >Node 0 might be offline, just like any other NUMA node; in that case
> >the kernel cannot handle memory allocation and crashes.
Isn't the bug that numa_node_id() returned an offline node? That
shouldn't happen.
#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
...
#ifndef numa_node_id
/* Returns the number of the current Node. */
static inline int numa_node_id(void)
{
return raw_cpu_read(numa_node);
}
#endif
...
#else /* !CONFIG_USE_PERCPU_NUMA_NODE_ID */
/* Returns the number of the current Node. */
#ifndef numa_node_id
static inline int numa_node_id(void)
{
return cpu_to_node(raw_smp_processor_id());
}
#endif
...
So that's either the per-cpu numa_node value or the result of
cpu_to_node() on the current processor, right?
> Example:
>
> [ 0.027133] ------------[ cut here ]------------
> [ 0.027938] kernel BUG at include/linux/gfp.h:322!
This is

VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid));

in alloc_pages_exact_node().
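For reference, that helper in include/linux/gfp.h looked roughly like
this at the time (paraphrased from memory, so treat it as a sketch
rather than verbatim source):

/* include/linux/gfp.h, circa 4.0 (sketch) */
static inline struct page *alloc_pages_exact_node(int nid, gfp_t gfp_mask,
						  unsigned int order)
{
	/* an offline or out-of-range nid trips this check */
	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid));

	return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
}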
And based on the trace below, that's

alloc_pages_exact_node
	<- alloc_slab_page
	<- allocate_slab
	<- new_slab
	<- new_slab_objects
	<- __slab_alloc
which is just passing the node value down, right? And that node value,
I think, came from:
domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size),
GFP_KERNEL, of_node_to_nid(of_node));
?
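As I read the patch in $SUBJECT, the fallback of_node_to_nid() would
return NUMA_NO_NODE instead of 0 here, which kzalloc_node() treats as
"no node preference" rather than feeding it to the exact-node path. A
sketch of the proposed fallback (not the verbatim patch):

static inline int of_node_to_nid(struct device_node *np)
{
	return NUMA_NO_NODE;	/* was: return 0; */
}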
What platform is this on? It looks to be x86. Is this qemu emulation of
a pathological topology? What was the topology?
Note that there is a ton of code that assumes node 0 is online. I
started working on removing this assumption myself, and it just led
down a rathole (as a result, on power we always have node 0 online,
even if it is memoryless and cpuless).
I am guessing this is just happening early in boot, before the per-cpu
areas are set up? That's why (I think) x86 has the early_cpu_to_node()
function...
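On x86, if I remember right, early_cpu_to_node() consults an early map
that is valid before the per-cpu areas exist; roughly (from memory, not
verbatim):

/* arch/x86, !CONFIG_DEBUG_PER_CPU_MAPS case (sketch) */
static inline int early_cpu_to_node(int cpu)
{
	return early_per_cpu(x86_cpu_to_node_map, cpu);
}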
Or do you not have CONFIG_OF set? In that case, isn't the only change
necessary the one to the include file, which should just return
first_online_node rather than 0?
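Something like this is what I mean (a sketch of the suggestion,
untested):

/* hypothetical !CONFIG_OF stub returning an online node instead of 0 */
static inline int of_node_to_nid(struct device_node *np)
{
	return first_online_node;
}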
Ah, and there are more of those node 0 assumptions :)

#define first_online_node 0
#define first_memory_node 0

if MAX_NUMNODES == 1...
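For context, include/linux/nodemask.h reads roughly like this (again
from memory, so double-check):

#if MAX_NUMNODES > 1
#define first_online_node	first_node(node_states[N_ONLINE])
#define first_memory_node	first_node(node_states[N_MEMORY])
#else
#define first_online_node	0
#define first_memory_node	0
#endif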
-Nish