[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1002251627040.18861@router.home>
Date: Thu, 25 Feb 2010 16:31:00 -0600 (CST)
From: Christoph Lameter <cl@...ux-foundation.org>
To: David Rientjes <rientjes@...gle.com>
cc: Pekka Enberg <penberg@...helsinki.fi>,
Andi Kleen <andi@...stfloor.org>,
Nick Piggin <npiggin@...e.de>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, haicheng.li@...el.com,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH] [4/4] SLAB: Fix node add timer race in cache_reap
On Thu, 25 Feb 2010, David Rientjes wrote:
> On Thu, 25 Feb 2010, Christoph Lameter wrote:
>
> > > I don't see how memory hotadd with a new node being onlined could have
> > > worked fine before since slab lacked any memory hotplug notifier until
> > > Andi just added it.
> >
> > AFAICR The cpu notifier took on that role in the past.
> >
>
> The cpu notifier isn't involved if the firmware notifies the kernel that a
> new ACPI memory device has been added or you write a start address to
> /sys/devices/system/memory/probe. Hot-added memory devices can include
> ACPI_SRAT_MEM_HOT_PLUGGABLE entries in the SRAT for x86 that assign them
> non-online node ids (although all such entries get their bits set in
> node_possible_map at boot), so a new pgdat may be allocated for the node's
> registered range.
Yes Andi's work makes it explicit but there is already code in the cpu
notifier (see cpuup_prepare) that seems to have been intended to
initialize the node structures. Wonder why the hotplug people never
addressed that issue? Kame?
list_for_each_entry(cachep, &cache_chain, next) {
/*
* Set up the size64 kmemlist for cpu before we can
* begin anything. Make sure some other cpu on this
* node has not already allocated this
*/
if (!cachep->nodelists[node]) {
l3 = kmalloc_node(memsize, GFP_KERNEL, node);
if (!l3)
goto bad;
kmem_list3_init(l3);
l3->next_reap = jiffies + REAPTIMEOUT_LIST3 +
((unsigned long)cachep) % REAPTIMEOUT_LIST3;
/*
* The l3s don't come and go as CPUs come and
* go. cache_chain_mutex is sufficient
* protection here.
*/
cachep->nodelists[node] = l3;
}
spin_lock_irq(&cachep->nodelists[node]->list_lock);
cachep->nodelists[node]->free_limit =
(1 + nr_cpus_node(node)) *
cachep->batchcount + cachep->num;
spin_unlock_irq(&cachep->nodelists[node]->list_lock);
}
> kmalloc_node() in generic kernel code. All that is done under
> MEM_GOING_ONLINE and not MEM_ONLINE, which is why I suggest the first and
> fourth patch in this series may not be necessary if we prevent setting the
> bit in the nodemask or building the zonelists until the slab nodelists are
> ready.
That sounds good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists