Message-ID: <20121130105247.GB8218@suse.de>
Date: Fri, 30 Nov 2012 10:52:47 +0000
From: Mel Gorman <mgorman@...e.de>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: "H. Peter Anvin" <hpa@...or.com>, Jiang Liu <jiang.liu@...wei.com>,
Tang Chen <tangchen@...fujitsu.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"rob@...dley.net" <rob@...dley.net>,
"isimatu.yasuaki@...fujitsu.com" <isimatu.yasuaki@...fujitsu.com>,
"laijs@...fujitsu.com" <laijs@...fujitsu.com>,
"wency@...fujitsu.com" <wency@...fujitsu.com>,
"linfeng@...fujitsu.com" <linfeng@...fujitsu.com>,
"yinghai@...nel.org" <yinghai@...nel.org>,
"kosaki.motohiro@...fujitsu.com" <kosaki.motohiro@...fujitsu.com>,
"minchan.kim@...il.com" <minchan.kim@...il.com>,
"rientjes@...gle.com" <rientjes@...gle.com>,
"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
Len Brown <lenb@...nel.org>,
"Wang, Frank" <frank.wang@...el.com>
Subject: Re: [PATCH v2 0/5] Add movablecore_map boot option

On Fri, Nov 30, 2012 at 02:58:40AM +0000, Luck, Tony wrote:
> > If any significant percentage of memory is in ZONE_MOVABLE then the memory
> > hotplug people will have to deal with all the lowmem/highmem problems
> > that used to be faced by 32-bit x86 with PAE enabled.
>
> While these problems may still exist on large systems - I think it becomes
> harder to construct workloads that run into problems. In those bad old days
> a significant fraction of lowmem was consumed by the kernel ... so it was
> pretty easy to find meta-data intensive workloads that would push it over
> a cliff. Here we are talking about systems with say 128GB per node divided
> into 64GB moveable and 64GB non-moveable (and I'd regard this as a rather
> low-end machine). Unless the workload consists of zillions of tiny processes
> all mapping shared memory blocks, the percentage of memory allocated to
> the kernel is going to be tiny compared with the old 4GB days.
>

Sure, if that's how the end-user decides to configure it. My concern is
what they'll do is configure node-0 to be ZONE_NORMAL and all other nodes
to be ZONE_MOVABLE -- 3 to 1 ratio "highmem" to "lowmem" effectively on
a 4-node machine or 7 to 1 on an 8-node. It'll be harder than it was in
the old days to trigger the problems but it'll still be possible and it
will generate bug reports down the road. Some will be obvious at least --
OOM killer triggered for GFP_KERNEL with plenty of free memory but all in
ZONE_MOVABLE. Others will be less obvious -- major stalls during IO tests
while ramping up with large amounts of reclaim activity visible even though
only 20-40% of memory is in use.

I'm not even getting into the impact this has on NUMA performance.

I'm not saying that ZONE_MOVABLE will not work. It will and it'll work
in the short-term but it's far from being a great long-term solution and
it is going to generate bug reports that will have to be supported by
distributions. Even if the configuration interface gets ironed out, there
should still be a replacement plan in place. FWIW, I dislike the
command-line configuration option. If it were me, I would have started the
machine with memory mostly off-lined and used sysfs files, or different
sysfs strings written to the "online" file, to determine whether a section
goes to ZONE_MOVABLE or the next best alternative.
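
As a rough illustration of that sysfs-driven alternative, the sketch below
writes a per-section policy string to a control file when the section is
brought online. The function name, the two policy strings and the idea of
one control file per memory section are assumptions made for the example;
nothing above specifies an actual interface.

```python
from pathlib import Path

# Hypothetical policy strings; the message does not name them.
POLICIES = {
    "kernel": "online_kernel",    # section usable for kernel allocations
    "movable": "online_movable",  # section restricted to movable pages
}


def online_section(control_file: Path, policy: str) -> str:
    """Write the string that selects a zone policy for one memory section.

    control_file stands in for a per-section sysfs file (an assumption);
    any writable path works, which keeps the sketch testable.
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    value = POLICIES[policy]
    control_file.write_text(value)
    return value
```

The point of the design is that the choice is made per section at online
time, from userspace, rather than once for the whole machine on the kernel
command line.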
--
Mel Gorman
SUSE Labs