Date:	Fri, 17 Oct 2008 10:07:46 +0200 (CEST)
From:	Oliver Weihe <o.weihe@...tacomputer.de>
To:	Christoph Lameter <cl@...ux-foundation.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: Fw: NUMA allocator on Opteron systems does non-local allocation on node0

Hi,

this problem/question is already solved for me. Andi suggested posting
this on the linux-mm mailing list, and they helped me. :)

> > I've noticed that the memory allocator on NUMA systems (Opterons)
> > allocates memory on non-local nodes for processes running on node0
> > even if local memory is available. (Kernel 2.6.25 and above)
> 
> How much local memory is available? 8GB per node? That means there
> will be 4GB on node 0 in ZONE_DMA32 and 4GB in ZONE_NORMAL. Other
> nodes will have 8GB in ZONE_NORMAL.

You're right. This machine has 8GiB per node. Due to the memory hole
below 4GiB, the machine has only ~3GiB of ZONE_DMA32, which matches my
observations perfectly.
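
(For anyone who wants to verify the split on their own box: the
per-node zone sizes can be read from /proc/zoneinfo, e.g.

    $ grep -E '^Node|present' /proc/zoneinfo

which prints each "Node X, zone ..." header together with that zone's
"present" page count.)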


> > In my setup I'm allocating an array of ~7GiB in a single-threaded
> > application.
> > Startup: numactl --cpunodebind=X ./app
> > For X=1,2,3 it works as expected: all memory is allocated on the
> > local node.
> > For X=0 I can see the memory being allocated on node0 only as long
> > as more than ~3GiB are "free" on node0. At that point the kernel
> > starts using memory from node1 for the app!
> 
> NUMA only supports memory policies for the highest zone, which is
> ZONE_NORMAL here. Only 4GB of ZONE_NORMAL are available on node 0, so
> allocations will go off-node after that memory is exhausted. This is
> done in order to preserve the lower 4GB for I/O to 32-bit devices.
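
(Side note: the spill-over is easy to watch from a second shell while
the test runs, e.g.

    $ numactl --cpunodebind=0 ./app &
    $ watch -n1 numastat

numastat ships with the numactl package; node1's numa_miss/other_node
counters should start climbing once node0's ZONE_NORMAL is exhausted.)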

I've changed the policy from "default" to "node"
(/proc/sys/vm/numa_zonelist_order) and now it works fine for me.
Policy "default" automatically selects "node" or "zone" depending on
the machine. When the policy is set to "default", the kernel (2.6.27)
chooses "node" if any of the following holds:
1. there is no ZONE_DMA32
2. the size of ZONE_DMA32 is greater than 50% of the system memory
3. the size of ZONE_DMA32 is greater than 60% of the node-local memory
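
For reference, the ordering can be inspected and switched at runtime
(the kernel rebuilds the zonelists on write):

    # cat /proc/sys/vm/numa_zonelist_order
    default
    # echo node > /proc/sys/vm/numa_zonelist_order

The same setting can also be passed at boot time via the
numa_zonelist_order= kernel parameter.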


-- 

Regards,
Oliver Weihe

