Message-ID: <4D05DB80B95B23498C72C700BD6C2E0B2EF6E127@pdsmsx502.ccr.corp.intel.com>
Date: Tue, 19 May 2009 09:16:25 +0800
From: "Zhang, Yanmin" <yanmin.zhang@...el.com>
To: "Wu, Fengguang" <fengguang.wu@...el.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
CC: LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>,
Christoph Lameter <cl@...ux-foundation.org>
Subject: RE: [PATCH 4/4] zone_reclaim_mode is always 0 by default
>>-----Original Message-----
>>From: Wu, Fengguang
>>Sent: May 18, 2009 11:49
>>To: KOSAKI Motohiro
>>Cc: LKML; linux-mm; Andrew Morton; Rik van Riel; Christoph Lameter; Zhang,
>>Yanmin
>>Subject: Re: [PATCH 4/4] zone_reclaim_mode is always 0 by default
>>
>>On Wed, May 13, 2009 at 12:08:12PM +0900, KOSAKI Motohiro wrote:
>>> Subject: [PATCH] zone_reclaim_mode is always 0 by default
>>>
>>> Current Linux policy is that if the machine has a large remote node distance,
>>> zone_reclaim_mode is enabled by default, because until recently we could
>>> assume that a large distance meant a large server.
>>>
>>> Unfortunately, recent x86 CPUs (e.g. Core i7, Opteron) have a P2P transport
>>> memory controller. IOW, they are NUMA from the software's point of view.
>>>
>>> Some Core i7 machines have a large remote node distance, and zone_reclaim
>>> doesn't fit desktops and small file servers. It causes a performance regression.
>>
>>I can confirm this, Yanmin recently ran into exactly such a
>>regression, which was fixed by manually disabling the zone reclaim
>>mode. So I guess you can safely add an
[YM] Fengguang told the truth. One Nehalem machine has 12GB of memory,
but 2GB always stays free even though applications access lots of files.
Eventually we traced the root cause to zone_reclaim_mode=1.
Acked.
>>
>>Tested-by: "Zhang, Yanmin" <yanmin.zhang@...el.com>
>>
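For anyone who wants to check whether a machine is in the same state, here is a minimal user-space sketch, not part of the patch. It assumes the sysctl is exposed at /proc/sys/vm/zone_reclaim_mode and that the bit meanings are the ones described in Documentation/sysctl/vm.txt. The function and macro names are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Bit meanings as described in Documentation/sysctl/vm.txt.
 * The macro names are made up for this sketch. */
#define ZR_RECLAIM 1 /* zone reclaim on */
#define ZR_WRITE   2 /* zone reclaim writes dirty pages out */
#define ZR_SWAP    4 /* zone reclaim swaps pages */

/* Parse the text read from the sysctl file.
 * Returns the mode (0..7), or -1 if the text is not a valid mode. */
int parse_zone_reclaim_mode(const char *text)
{
	char *end;
	long v = strtol(text, &end, 10);

	if (end == text || v < 0 || v > (ZR_RECLAIM | ZR_WRITE | ZR_SWAP))
		return -1;
	return (int)v;
}

/* Read and parse the sysctl file at 'path'.
 * Returns -1 if the file cannot be opened or parsed. */
int read_zone_reclaim_mode(const char *path)
{
	char buf[32];
	int mode = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		mode = parse_zone_reclaim_mode(buf);
	fclose(f);
	return mode;
}
```

Calling read_zone_reclaim_mode("/proc/sys/vm/zone_reclaim_mode") and testing the result against ZR_RECLAIM shows whether zone reclaim is on. A result of -1 only means the file could not be read or parsed, for example on a kernel built without NUMA support.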
>>> Thus, zone_reclaim_mode == 0 is a better default. Sorry, HPC guys:
>>> you need to turn zone_reclaim_mode on manually now.
>>
>>I guess the borderline will continue to blur. It will depend more
>>on workloads than on physical NUMA capabilities. So
>>
>>Acked-by: Wu Fengguang <fengguang.wu@...el.com>
>>
>>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
>>> Cc: Christoph Lameter <cl@...ux-foundation.org>
>>> Cc: Rik van Riel <riel@...hat.com>
>>> ---
>>> mm/page_alloc.c | 7 -------
>>> 1 file changed, 7 deletions(-)
>>>
>>> Index: b/mm/page_alloc.c
>>> ===================================================================
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2494,13 +2494,6 @@ static void build_zonelists(pg_data_t *p
>>>  		int distance = node_distance(local_node, node);
>>>
>>>  		/*
>>> -		 * If another node is sufficiently far away then it is better
>>> -		 * to reclaim pages in a zone before going off node.
>>> -		 */
>>> -		if (distance > RECLAIM_DISTANCE)
>>> -			zone_reclaim_mode = 1;
>>> -
>>> -		/*
>>>  		 * We don't want to pressure a particular node.
>>>  		 * So adding penalty to the first node in same
>>>  		 * distance group to make it round-robin.
>>>
>>>
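The effect of the removed check can be modelled in user space. The sketch below is an illustration under stated assumptions, not kernel code: the distance table is a flat row-major array invented for the example, the function names are hypothetical, and RECLAIM_DISTANCE is assumed to be 20, the generic default in include/linux/topology.h at the time (architectures may override it).

```c
#include <stddef.h>

/* Assumed generic default; an architecture may define its own value. */
#define RECLAIM_DISTANCE 20

/* Model of the old default: zone reclaim is switched on as soon as any
 * node is farther from 'local' than RECLAIM_DISTANCE.
 * 'distance' is a hypothetical nr_nodes x nr_nodes row-major table. */
int old_default_zone_reclaim_mode(const int *distance, size_t nr_nodes,
				  size_t local)
{
	size_t node;

	for (node = 0; node < nr_nodes; node++) {
		if (distance[local * nr_nodes + node] > RECLAIM_DISTANCE)
			return 1;
	}
	return 0;
}

/* Model of the default after the patch: always off, whatever the
 * node distances are. The administrator enables it by hand. */
int new_default_zone_reclaim_mode(void)
{
	return 0;
}
```

With a made-up two-node table of { 10, 21, 21, 10 }, the old model returns 1 and the new one returns 0. With { 10, 20, 20, 10 } both return 0, since the old comparison was strictly greater than RECLAIM_DISTANCE.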