[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0703201248530.12664@graphe.net>
Date: Tue, 20 Mar 2007 12:54:45 -0700 (PDT)
From: Christoph Lameter <christoph@...eter.com>
To: Eric Dumazet <dada1@...mosbay.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Andi Kleen <andi@...stfloor.org>,
linux kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of
virt_to_slab(objp); nodeid = slabp->nodeid;
On Tue, 20 Mar 2007, Eric Dumazet wrote:
>
> I noticed on a small x86_64 NUMA setup (2 nodes) that cache_free_alien() is very expensive.
> This is because of a cache miss on struct slab.
> At the time an object is freed (call to kmem_cache_free() for example), the underlying 'struct slab' is not anymore cache-hot.
>
> struct slab *slabp = virt_to_slab(objp);
> nodeid = slabp->nodeid; // cache miss
>
> So we currently need slab only to lookup nodeid, to be able to use the cachep cpu cache, or not.
>
> Couldn't we use something less expensive, like pfn_to_nid() ?
Nodeid describes the node that the slab was allocated for which may
not be equal to the node that the page came from. But if GFP_THISNODE is
used then this should always be the same. That those two nodeid are
different was certainly frequent before the GFP_THISNODE work went in. Now
it may just occur in corner cases. Perhaps during bootup on
some machines that boot with empty nodes. I vaguely recall a powerpc
issue.
> Is it possible virt_to_slab(objp)->nodeid being different from pfn_to_nid(objp) ?
It is possible the page allocator falls back to another node than
requested. We would need to check that this never occurs.
If we are sure then we could drop the nodeid field completely.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists