Date:	Tue, 28 Sep 2010 22:42:20 +1000
From:	"Bron Gondwana" <brong@...tmail.fm>
To:	"Christoph Lameter" <cl@...ux.com>,
	"Robert Mueller" <robm@...tmail.fm>
Cc:	"KOSAKI Motohiro" <kosaki.motohiro@...fujitsu.com>,
	"Mel Gorman" <mel@....ul.ie>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"linux-mm" <linux-mm@...ck.org>
Subject: Re: Default zone_reclaim_mode = 1 on NUMA kernel is bad
 for file/email/web servers

On Tue, 28 Sep 2010 07:35 -0500, "Christoph Lameter" <cl@...ux.com> wrote:
> > The problem we saw was purely with file caching. The application wasn't
> > actually allocating much memory itself, but it was reading lots of files
> > from disk (via mmap'ed memory mostly), and as most people would, we
> > expected that data would be cached in memory to reduce future reads from
> > disk. That was not happening.
> 
> Obviously, and you have stated that numerous times. The problem is that
> using remote memory reduces read performance, so the OS (with
> zone_reclaim=1) defaults to using local memory and favors reclaiming
> local memory over allocating from the remote node. This is fine if you
> have multiple applications running on both nodes, because then each
> application gets memory local to it and therefore runs faster. That
> does not work with a single app that only allocates from one node.

Is this what's happening, or is IO actually coming from disk in preference
to the remote node?  I can certainly see the logic behind preferring to
reclaim from the local node if that's all that's happening - though the OS
should be placing the different tasks more evenly across the nodes in that
case.
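
For reference, checking and flipping this at runtime is just the standard
sysctl/procfs knob (a rough sketch; the node layout, and whether 0 is the
right value, is exactly what's under discussion here):

  # show the NUMA layout and the current reclaim setting
  numactl --hardware
  cat /proc/sys/vm/zone_reclaim_mode

  # 0 = allow allocations (and page cache) to spill to remote nodes
  # instead of reclaiming local memory first
  sysctl -w vm.zone_reclaim_mode=0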

> Control over memory allocation across the various NUMA nodes for a
> process can be exercised via the numactl tool or the libnuma C APIs.
>
> F.e.
>
> numactl --interleave ... command
>
> will address that issue for a specific command that needs to go

Gosh what a pain.  While it won't kill us too much to add to our
startup, it does feel a lot like the tail is wagging the dog from here
still.  A task that doesn't ask for anything special should get sane
defaults, and the cost of data from the other node should be a lot
less than the cost of the same data from spinning rust.
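
If we do add it, I assume the wrapper amounts to no more than this in the
init script (the daemon path below is just a placeholder, not our actual
setup):

  # placeholder path; --interleave=all spreads the process's allocations
  # round-robin over all nodes instead of keeping them on the local one
  numactl --interleave=all /path/to/server-daemon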

Bron.
-- 
  Bron Gondwana
  brong@...tmail.fm

