Date: Wed, 11 Nov 2009 12:02:06 +0900 (JST)
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: David Rientjes <rientjes@...gle.com>
Cc: kosaki.motohiro@...fujitsu.com,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...ux-foundation.org>
Subject: Re: [BUGFIX][PATCH] oom-kill: fix NUMA consraint check with nodemask v3
Hi
> On Wed, 11 Nov 2009, KAMEZAWA Hiroyuki wrote:
>
> > Index: mm-test-kernel/drivers/char/sysrq.c
> > ===================================================================
> > --- mm-test-kernel.orig/drivers/char/sysrq.c
> > +++ mm-test-kernel/drivers/char/sysrq.c
> > @@ -339,7 +339,7 @@ static struct sysrq_key_op sysrq_term_op
> >
> > static void moom_callback(struct work_struct *ignored)
> > {
> > - out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0);
> > + out_of_memory(node_zonelist(0, GFP_KERNEL), GFP_KERNEL, 0, NULL);
> > }
> >
> > static DECLARE_WORK(moom_work, moom_callback);
> > Index: mm-test-kernel/mm/oom_kill.c
> > ===================================================================
> > --- mm-test-kernel.orig/mm/oom_kill.c
> > +++ mm-test-kernel/mm/oom_kill.c
> > @@ -196,27 +196,45 @@ unsigned long badness(struct task_struct
> > /*
> > * Determine the type of allocation constraint.
> > */
> > +#ifdef CONFIG_NUMA
> > static inline enum oom_constraint constrained_alloc(struct zonelist *zonelist,
> > - gfp_t gfp_mask)
> > + gfp_t gfp_mask, nodemask_t *nodemask)
>
> We should probably remove the inline specifier, there's only one caller
> currently and if additional ones were added in the future this function
> should probably not be replicated.
Good catch.
> > {
> > -#ifdef CONFIG_NUMA
> > struct zone *zone;
> > struct zoneref *z;
> > enum zone_type high_zoneidx = gfp_zone(gfp_mask);
> > - nodemask_t nodes = node_states[N_HIGH_MEMORY];
> > + int ret = CONSTRAINT_NONE;
> >
> > - for_each_zone_zonelist(zone, z, zonelist, high_zoneidx)
> > - if (cpuset_zone_allowed_softwall(zone, gfp_mask))
> > - node_clear(zone_to_nid(zone), nodes);
> > - else
> > + /*
> > + * The nodemask here is a nodemask passed to alloc_pages(). Now,
> > + * cpuset doesn't use this nodemask for its hardwall/softwall/hierarchy
> > + * feature. mempolicy is an only user of nodemask here.
> > + */
> > + if (nodemask) {
> > + nodemask_t mask;
> > + /* check mempolicy's nodemask contains all N_HIGH_MEMORY */
> > + nodes_and(mask, *nodemask, node_states[N_HIGH_MEMORY]);
> > + if (!nodes_equal(mask, node_states[N_HIGH_MEMORY]))
> > + return CONSTRAINT_MEMORY_POLICY;
> > + }
>
> Although a nodemask_t was previously allocated on the stack, we should
> probably change this to use NODEMASK_ALLOC() for kernels with higher
> CONFIG_NODES_SHIFT since allocations can happen very deep into the stack.
No. NODEMASK_ALLOC() is crap; we should remove it.
By the way, CPUMASK_ALLOC was already removed.
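For a sense of the stack cost under discussion: a nodemask_t is a bitmap with one bit per possible node, so its size grows with CONFIG_NODES_SHIFT. The following is a rough userspace sketch of that arithmetic, not kernel code; the helper name model_nodemask_bytes is hypothetical and only models rounding the bitmap up to whole unsigned longs.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not a kernel function.
 * Returns the number of bytes a bitmap of (1 << nodes_shift) bits
 * occupies when stored as an array of unsigned long.
 */
static size_t model_nodemask_bytes(unsigned int nodes_shift)
{
	size_t bits = (size_t)1 << nodes_shift;
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	/* Round up to a whole number of longs. */
	size_t longs = (bits + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}
```

With a shift of 10 (1024 possible nodes) this gives 128 bytes per on-stack mask, which is the kind of footprint the NODEMASK_ALLOC() suggestion was aimed at.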
> There should be a way around that, however. Shouldn't
>
> if (nodes_subset(node_states[N_HIGH_MEMORY], *nodemask))
> return CONSTRAINT_MEMORY_POLICY;
>
> be sufficient?
Is this safe in the memory hotplug case?
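Setting hotplug aside, the pure bit arithmetic of the two forms can be compared in userspace. The sketch below models a nodemask as a 64-bit integer, which is an assumption made for illustration; nodemask_model_t and the model_* and constrained_* names are hypothetical and are not kernel APIs. It suggests that the patch's nodes_and()/nodes_equal() test corresponds to the negated subset test, i.e. the constraint applies when N_HIGH_MEMORY is not fully contained in the policy nodemask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model: one bit per node, at most 64 nodes. Not nodemask_t. */
typedef uint64_t nodemask_model_t;

/* Models nodes_subset(a, b): true iff every node set in a is also set in b. */
static bool model_nodes_subset(nodemask_model_t a, nodemask_model_t b)
{
	return (a & ~b) == 0;
}

/* Form used in the patch: AND with the memory nodes, then compare. */
static bool constrained_by_and_equal(nodemask_model_t policy,
				     nodemask_model_t high_mem)
{
	nodemask_model_t mask = policy & high_mem;

	return mask != high_mem;
}

/* Subset form; note the negation needed to match the patch's condition. */
static bool constrained_by_subset(nodemask_model_t policy,
				  nodemask_model_t high_mem)
{
	return !model_nodes_subset(high_mem, policy);
}

/* Exhaustively compare both forms over all 8-bit masks. */
static void check_equivalence(void)
{
	for (unsigned int p = 0; p < 256; p++)
		for (unsigned int h = 0; h < 256; h++)
			assert(constrained_by_and_equal(p, h) ==
			       constrained_by_subset(p, h));
}
```

This only shows that the two expressions agree on a fixed snapshot of the masks, and that the subset form avoids the temporary mask. It says nothing about whether node_states[N_HIGH_MEMORY] can change underneath the check during memory hotplug.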