Date: Wed, 11 Nov 2009 12:23:43 +0900 (JST)
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: David Rientjes <rientjes@...gle.com>
Cc: kosaki.motohiro@...fujitsu.com,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...ux-foundation.org>
Subject: Re: [BUGFIX][PATCH] oom-kill: fix NUMA consraint check with nodemask v3

> On Wed, 11 Nov 2009, KOSAKI Motohiro wrote:
>
> > > > {
> > > > -#ifdef CONFIG_NUMA
> > > > struct zone *zone;
> > > > struct zoneref *z;
> > > > enum zone_type high_zoneidx = gfp_zone(gfp_mask);
> > > > - nodemask_t nodes = node_states[N_HIGH_MEMORY];
> > > > + int ret = CONSTRAINT_NONE;
> > > >
> > > > - for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
> > > > - if (cpuset_zone_allowed_softwall(zone, gfp_mask))
> > > > - node_clear(zone_to_nid(zone), nodes);
> > > > - else
> > > > + /*
> > > > + * The nodemask here is a nodemask passed to alloc_pages(). Now,
> > > > + * cpuset doesn't use this nodemask for its hardwall/softwall/hierarchy
> > > > + * feature. mempolicy is an only user of nodemask here.
> > > > + */
> > > > + if (nodemask) {
> > > > + nodemask_t mask;
> > > > + /* check mempolicy's nodemask contains all N_HIGH_MEMORY */
> > > > + nodes_and(mask, *nodemask, node_states[N_HIGH_MEMORY]);
> > > > + if (!nodes_equal(mask, node_states[N_HIGH_MEMORY]))
> > > > + return CONSTRAINT_MEMORY_POLICY;
> > > > + }
> > >
> > > Although a nodemask_t was previously allocated on the stack, we should
> > > probably change this to use NODEMASK_ALLOC() for kernels with higher
> > > CONFIG_NODES_SHIFT since allocations can happen very deep into the stack.
> >
> > No. NODEMASK_ALLOC() is crap. we should remove it.
>
> I've booted 1K node systems and have found it to be helpful to ensure that
> the stack will not overflow especially in areas where we normally are deep
> already, such as in the page allocator.

Linux doesn't support 1K nodes (and only huge SGI machines use 512 nodes).
At the very least, NODEMASK_ALLOC should be given a cleaner interface. The
current one and struct nodemask_scratch are pretty ugly.

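To put rough numbers on the stack-size question, here is a small user-space
sketch. It assumes a node mask is a plain bitmap of (1 << NODES_SHIFT) bits
rounded up to whole unsigned longs; nodemask_bytes() is a hypothetical helper
written for this illustration, not a kernel function.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes occupied by a bitmap holding
 * (1 << nodes_shift) bits, rounded up to whole unsigned longs.
 * This only models the assumed layout of a node mask.
 */
static size_t nodemask_bytes(unsigned int nodes_shift)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t bits, longs;

	/* Guard against shifting past the width of size_t. */
	assert(nodes_shift < sizeof(size_t) * CHAR_BIT);

	bits = (size_t)1 << nodes_shift;
	longs = (bits + bits_per_long - 1) / bits_per_long;
	return longs * sizeof(unsigned long);
}
```

Under that assumption, nodemask_bytes(9) gives 64 bytes for 512 nodes and
nodemask_bytes(10) gives 128 bytes for 1K nodes, so each on-stack mask costs
that much in every frame that declares one.
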
> > btw, CPUMASK_ALLOC was already removed.
>
> I don't remember CPUMASK_ALLOC() actually being merged. I know the
> comment exists in nodemask.h, but I don't recall any CPUMASK_ALLOC() users
> in the tree.
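
For reference, the nodemask test in the quoted hunk (AND the allocation's
nodemask with the set of memory nodes, then compare the result against that
set) can be sketched in user space roughly as below. All toy_* names and
TOY_MASK_LONGS are hypothetical stand-ins made up for this sketch; the kernel
code uses nodemask_t, nodes_and() and nodes_equal() as shown in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MASK_LONGS 4	/* hypothetical fixed size, illustration only */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

struct toy_nodemask {
	unsigned long bits[TOY_MASK_LONGS];
};

enum toy_constraint {
	TOY_CONSTRAINT_NONE,
	TOY_CONSTRAINT_MEMORY_POLICY,
};

/* Mark one node as present in the mask. */
static void toy_node_set(struct toy_nodemask *m, unsigned int node)
{
	assert(node < TOY_MASK_LONGS * TOY_BITS_PER_LONG);
	m->bits[node / TOY_BITS_PER_LONG] |= 1UL << (node % TOY_BITS_PER_LONG);
}

/* True if every node in memory_nodes is also set in allowed. */
static bool toy_covers(const struct toy_nodemask *allowed,
		       const struct toy_nodemask *memory_nodes)
{
	for (size_t i = 0; i < TOY_MASK_LONGS; i++) {
		unsigned long both = allowed->bits[i] & memory_nodes->bits[i];

		if (both != memory_nodes->bits[i])
			return false;
	}
	return true;
}

/*
 * Mirrors the shape of the quoted check: a non-NULL mask that does not
 * cover all memory nodes means the allocation was restricted by policy.
 */
static enum toy_constraint
toy_constrained_alloc(const struct toy_nodemask *allowed,
		      const struct toy_nodemask *memory_nodes)
{
	if (allowed && !toy_covers(allowed, memory_nodes))
		return TOY_CONSTRAINT_MEMORY_POLICY;
	return TOY_CONSTRAINT_NONE;
}
```

A NULL mask is treated as unconstrained, matching the `if (nodemask)` guard in
the hunk; the sketch works in place and needs no temporary mask.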