Message-ID: <52C8765522A740A4A5C027E8FDFFDFE3@jem>
Date: Tue, 21 Sep 2010 09:41:21 +1000
From: "Rob Mueller" <robm@...tmail.fm>
To: "Mel Gorman" <mel@....ul.ie>,
"KOSAKI Motohiro" <kosaki.motohiro@...fujitsu.com>
Cc: <linux-kernel@...r.kernel.org>,
"Bron Gondwana" <brong@...tmail.fm>,
"linux-mm" <linux-mm@...ck.org>,
"Christoph Lameter" <cl@...ux-foundation.org>
Subject: Re: Default zone_reclaim_mode = 1 on NUMA kernel is bad for file/email/web servers
> I don't think we will ever get the default value for this tunable right.
> I would also worry that avoiding the reclaim_mode for file-backed
> cache will hurt HPC applications that are dumping their data to disk
> and depending on the existing default for zone_reclaim_mode to not
> pollute other nodes.
>
> The ideal would be if distribution packages for mail, web servers
> and others that are heavily IO orientated would prompt for a change
> to the default value of zone_reclaim_mode in sysctl.

I would argue that there are a lot more mail/web/file servers out there than
HPC machines, and HPC machines tend to have a team of people to
monitor/tweak them. I think it would be much saner to default this to 0,
which works best for most people, and get the HPC people to change it.
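
For anyone who wants to check or change the value on their own machine, here
is a minimal Python sketch. It assumes the tunable is exposed at
/proc/sys/vm/zone_reclaim_mode (the file only exists on NUMA-enabled kernels).
The function names and the `path` parameter are illustrative, not part of any
existing tool, and writing to the real file needs root.

```python
from pathlib import Path

# Standard procfs location of the tunable; present only on NUMA-enabled kernels.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode as an int, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


def write_zone_reclaim_mode(value, path=ZONE_RECLAIM_PATH):
    """Write a new value (hypothetical helper; needs root for the real procfs file)."""
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode not available (non-NUMA kernel?)")
    else:
        print(f"zone_reclaim_mode = {mode}")
```

A change made this way lasts only until reboot. To keep it, the usual approach
is a line such as `vm.zone_reclaim_mode = 0` in /etc/sysctl.conf.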

However, there's still another question: why is this problem happening at all
for us? I know almost nothing about NUMA, but from other posts, it sounds
like the problem is that the memory allocations are all happening on one node.
But I don't understand why that would be happening. The machine runs the
Cyrus IMAP server, which is a classic Unix forking server with thousands of
processes. Each process will mmap lots of different files to access them.
Why would that all be happening on one node, rather than being spread around?

One thing is that the machine is vastly more IO loaded than CPU loaded; in
fact it uses very little CPU at all (usually a few percent). Does the kernel
prefer to run processes on one particular node if it's available? So if a
machine has very little CPU load, will every process generally end up
running on the same node?
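
One way to see whether memory really is piling up on a single node is to
compare the per-node counters the kernel exports under
/sys/devices/system/node/nodeN/meminfo. The sketch below is only a rough
diagnostic under that assumption; the function names and the `root` parameter
are made up for illustration.

```python
import re
from pathlib import Path

# sysfs directory holding one nodeN/ subdirectory per NUMA node.
NODE_ROOT = "/sys/devices/system/node"

# Matches lines such as "Node 0 MemFree:   1234 kB".
# Lines without a kB unit (e.g. HugePages_Total) are skipped.
LINE_RE = re.compile(r"^Node[ \t]+(\d+)[ \t]+([\w()]+):[ \t]+(\d+)[ \t]*kB", re.MULTILINE)


def parse_node_meminfo(text):
    """Parse the text of one node's meminfo file into {field: value_in_kB}."""
    return {m.group(2): int(m.group(3)) for m in LINE_RE.finditer(text)}


def per_node_memory(root=NODE_ROOT):
    """Return {node_id: {field: value_in_kB}} for every node found under root."""
    stats = {}
    for meminfo in Path(root).glob("node[0-9]*/meminfo"):
        node_id = int(meminfo.parent.name[len("node"):])
        stats[node_id] = parse_node_meminfo(meminfo.read_text())
    return stats


if __name__ == "__main__":
    for node, fields in sorted(per_node_memory().items()):
        print(f"node {node}: "
              f"total={fields.get('MemTotal')} kB "
              f"free={fields.get('MemFree')} kB "
              f"file pages={fields.get('FilePages')} kB")
```

If one node shows almost no free memory and a large FilePages figure while the
others sit mostly free, that would be consistent with allocations landing on
one node.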
Rob