Message-ID: <50937943.2040302@cn.fujitsu.com>
Date: Fri, 02 Nov 2012 15:41:55 +0800
From: Wen Congyang <wency@...fujitsu.com>
To: David Rientjes <rientjes@...gle.com>
CC: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-doc@...r.kernel.org, Rob Landley <rob@...dley.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Jiang Liu <jiang.liu@...wei.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Minchan Kim <minchan.kim@...il.com>,
Mel Gorman <mgorman@...e.de>, Yinghai Lu <yinghai@...nel.org>,
"rusty@...tcorp.com.au" <rusty@...tcorp.com.au>
Subject: Re: [PART3 Patch 00/14] introduce N_MEMORY
At 11/02/2012 05:36 AM, David Rientjes Wrote:
> On Thu, 1 Nov 2012, Wen Congyang wrote:
>
>>> This doesn't describe why we need the new node state, unfortunately. It
>>
>> 1. Sometimes, we use the node which contains the memory that can be used by
>> kernel.
>> 2. Sometimes, we use the node which contains the memory.
>>
>> In case1, we use N_HIGH_MEMORY, and we use N_MEMORY in case2.
>>
>
> Yeah, that's clear, but the question is still _why_ we want two different
> nodemasks. I know that this part of the patchset simply introduces the
> new nodemask because the name "N_MEMORY" is more clear than
> "N_HIGH_MEMORY", but there's no real incentive for making that change by
> introducing a new nodemask where a simple rename would suffice.
>
> I can only assume that you want to later use one of them for a different
> purpose: those that do not include nodes that consist of only
> ZONE_MOVABLE. But that change for MPOL_BIND is nacked since it
> significantly changes the semantics of set_mempolicy() and you can't break
> userspace (see my response to that from yesterday). Until that problem is
> addressed, then there's no reason for the additional nodemask so nack on
> this series as well.
>
I still think that we need two nodemasks: one that stores the nodes which have
memory that the kernel can use, and one that stores the nodes which have any memory.
For example:
==========================
static void *__meminit alloc_page_cgroup(size_t size, int nid)
{
	gfp_t flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN;
	void *addr = NULL;

	addr = alloc_pages_exact_nid(nid, size, flags);
	if (addr) {
		kmemleak_alloc(addr, size, 1, flags);
		return addr;
	}

	if (node_state(nid, N_HIGH_MEMORY))
		addr = vzalloc_node(size, nid);
	else
		addr = vzalloc(size);

	return addr;
}
==========================
If the node has only ZONE_MOVABLE memory, the kernel cannot allocate from
that node, so we should use vzalloc() instead of vzalloc_node(). So we need
a mask that stores the nodes which have memory that the kernel can use.
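
To illustrate the choice above outside the kernel, here is a minimal userspace
sketch. It is not kernel code: struct node_masks, node_in_mask(),
fallback_alloc_node() and NO_NODE are hypothetical names, and each mask is
modelled as a plain bitmask with one bit per node.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8
#define NO_NODE  (-1)	/* hypothetical: "no node preference" */

/* Hypothetical userspace model: bit N set means node N is in the mask. */
struct node_masks {
	unsigned int kernel_usable;	/* role played by N_HIGH_MEMORY above */
	unsigned int any_memory;	/* role played by the proposed N_MEMORY */
};

/* Return true if node nid is set in mask. */
static bool node_in_mask(unsigned int mask, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return (mask >> nid) & 1u;
}

/*
 * Pick the node for the fallback allocation: stay on nid only if the
 * kernel can allocate from it, otherwise express no preference.
 */
static int fallback_alloc_node(const struct node_masks *m, int nid)
{
	return node_in_mask(m->kernel_usable, nid) ? nid : NO_NODE;
}
```

With node 0 holding normal memory and node 1 holding only movable memory
(kernel_usable = 0x1, any_memory = 0x3), fallback_alloc_node() keeps node 0
and returns NO_NODE for node 1, even though node 1 is set in any_memory.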
==========================
static int mpol_set_nodemask(struct mempolicy *pol,
		const nodemask_t *nodes, struct nodemask_scratch *nsc)
{
	int ret;

	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
	if (pol == NULL)
		return 0;
	/* Check N_HIGH_MEMORY */
	nodes_and(nsc->mask1,
		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
	...
	if (pol->flags & MPOL_F_RELATIVE_NODES)
		mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
	else
		nodes_and(nsc->mask2, *nodes, nsc->mask1);
	...
}
==========================
If the user specifies two nodes, one that has only ZONE_MOVABLE memory and one
that doesn't, nsc->mask2 should contain both nodes. So we should have a mask
that stores the nodes which have memory.
There may be something wrong in the change for MPOL_BIND, but this patchset is still needed.
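
The effect on the policy mask can be shown with a small userspace sketch. It is
only a model under stated assumptions: nodebits_t, policy_nodes() and
policy_example() are hypothetical names, masks are plain bitmasks, both nodes
are assumed to be allowed by the cpuset, and only the non-relative
(nodes_and) path is modelled.

```c
#include <assert.h>

/* Hypothetical userspace model: bit N set means node N is in the mask. */
typedef unsigned int nodebits_t;

/* Mirrors the two nodes_and() steps: (allowed & state), then (user & that). */
static nodebits_t policy_nodes(nodebits_t user, nodebits_t allowed,
			       nodebits_t state)
{
	nodebits_t mask1 = allowed & state;

	return user & mask1;
}

/* Node 0 has kernel-usable memory; node 1 has only movable memory. */
static void policy_example(void)
{
	const nodebits_t kernel_usable = 0x1;	/* nodes the kernel can allocate from */
	const nodebits_t any_memory = 0x3;	/* nodes with any memory at all */
	const nodebits_t user = 0x3;		/* user asks for nodes 0 and 1 */
	const nodebits_t allowed = 0x3;		/* assumed: cpuset allows both */

	/* Filtering by the kernel-usable mask drops node 1. */
	assert(policy_nodes(user, allowed, kernel_usable) == 0x1);
	/* Filtering by the any-memory mask keeps both nodes. */
	assert(policy_nodes(user, allowed, any_memory) == 0x3);
}
```

In this model, only the any-memory mask preserves the two nodes the user asked for.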
Thanks
Wen Congyang