Message-Id: <20100521173940.8f130205.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Fri, 21 May 2010 17:39:40 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	minskey guo <chaohong_guo@...ux.intel.com>
Cc:	Stephen Rothwell <sfr@...b.auug.org.au>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	prarit@...hat.com, andi.kleen@...el.com,
	linux-kernel@...r.kernel.org, minskey guo <chaohong.guo@...el.com>,
	Tejun Heo <tj@...nel.org>, stable@...nel.org
Subject: Re: [PATCH] online CPU before memory failed in pcpu_alloc_pages()

On Fri, 21 May 2010 16:22:19 +0800
minskey guo <chaohong_guo@...ux.intel.com> wrote:

> Yes.  I can use cpu_to_mem().  There is only a small difference during
> CPU online: the 1st cpu within a memoryless node gets memory from the
> current node or from the node to which cpu0 belongs,
> 
> 
> But I have a question about the patch:
> 
>     numa-slab-use-numa_mem_id-for-slab-local-memory-node.patch,
> 
> 
> 
> 
> @@ -2968,9 +2991,23 @@ static int __build_all_zonelists(void *d
> ...
> 
> -	for_each_possible_cpu(cpu)
> +	for_each_possible_cpu(cpu) {
> 		setup_pageset(&per_cpu(boot_pageset, cpu), 0);
> ...
> 
> +#ifdef CONFIG_HAVE_MEMORYLESS_NODES
> + 	if (cpu_online(cpu))
> +		cpu_to_mem(cpu) = local_memory_node(cpu_to_node(cpu));
> +#endif
> 
> 
> Look at the last two lines.  Suppose that memory is onlined before CPUs:
> where will cpu_to_mem(cpu) be set to the right nodeid for the last
> onlined cpu?  Does that CPU always get memory from the node containing
> cpu0 when the slab allocator uses cpu_to_mem()?
> 
build_all_zonelists() is called at boot, during initialization.
And it calls local_memory_node(cpu_to_node(cpu)) for possible cpus.

So, "how cpu_to_node() is configured for possible cpus" is what matters.
At a quick look, arch/x86/mm/numa_64.c has the following code.


/*
 * Setup early cpu_to_node.
 *
 * Populate cpu_to_node[] only if x86_cpu_to_apicid[],
 * and apicid_to_node[] tables have valid entries for a CPU.
 * This means we skip cpu_to_node[] initialisation for NUMA
 * emulation and faking node case (when running a kernel compiled
 * for NUMA on a non NUMA box), which is OK as cpu_to_node[]
 * is already initialized in a round robin manner at numa_init_array,
 * prior to this call, and this initialization is good enough
 * for the fake NUMA cases.
 *
 * Called before the per_cpu areas are setup.
 */
void __init init_cpu_to_node(void)
{
        int cpu;
        u16 *cpu_to_apicid = early_per_cpu_ptr(x86_cpu_to_apicid);

        BUG_ON(cpu_to_apicid == NULL);

        for_each_possible_cpu(cpu) {
                int node;
                u16 apicid = cpu_to_apicid[cpu];

                if (apicid == BAD_APICID)
                        continue;
                node = apicid_to_node[apicid];
                if (node == NUMA_NO_NODE)
                        continue;
                if (!node_online(node))
                        node = find_near_online_node(node);
                numa_set_node(cpu, node);
        }
}
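
(For reference, find_near_online_node() used above just walks the online
nodes and picks the one with the smallest node_distance(); roughly, from
the same file, IIRC:)

static __init int find_near_online_node(int node)
{
        int n, val;
        int min_val = INT_MAX;
        int best_node = -1;

        /* scan all online nodes and remember the topologically closest one */
        for_each_online_node(n) {
                val = node_distance(node, n);

                if (val < min_val) {
                        min_val = val;
                        best_node = n;
                }
        }

        return best_node;
}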


So, cpu_to_node(cpu) for possible cpus will be either NUMA_NO_NODE (-1)
or the id of the nearest online node.

IIUC, if the SRAT is not broken, every pxm has its own node_id. So,
cpu_to_node(cpu) will return the nearest node, and cpu_to_mem() will
find the nearest node with memory.
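
(If I remember the helper from that series correctly, local_memory_node()
just returns the node of the first zone in the node's GFP_KERNEL fallback
zonelist, i.e. the nearest node that actually has memory; roughly:)

int local_memory_node(int node)
{
        struct zone *zone;

        /* first zone in this node's generic (GFP_KERNEL) zonelist */
        (void)first_zones_zonelist(node_zonelist(node, GFP_KERNEL),
                                   gfp_zone(GFP_KERNEL),
                                   NULL,
                                   &zone);
        return zone->node;
}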

Thanks,
-Kame



