Message-ID: <46005B7D.3090701@cosmosbay.com>
Date: Tue, 20 Mar 2007 23:09:01 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Andi Kleen <andi@...stfloor.org>
CC: Christoph Lameter <christoph@...eter.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux kernel <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] SLAB : NUMA cache_free_alien() very expensive because of
virt_to_slab(objp); nodeid = slabp->nodeid;
Andi Kleen wrote:
>>> Is it possible virt_to_slab(objp)->nodeid being different from pfn_to_nid(objp) ?
>> It is possible the page allocator falls back to another node than
>> requested. We would need to check that this never occurs.
>
> The only way to ensure that would be to set a strict mempolicy.
> But I'm not sure that's a good idea -- after all you don't want
> to fail an allocation in this case.
>
> But pfn_to_nid on the object like proposed by Eric should work anyways.
> But I'm not sure the tables used for that will be more often cache hot
> than the slab.
On most x86_64 machines, pfn_to_nid() accesses a single cache line (struct memnode).
Node 0 MemBase 0000000000000000 Limit 0000000280000000
Node 1 MemBase 0000000280000000 Limit 0000000480000000
NUMA: Using 31 for the hash shift.
In this example, only 8 bytes of memnode.embedded_map[] are needed to find the nid of
all 16 GB of RAM. In the profiles I have, memnode is always hot (no cache misses on it).
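To make the arithmetic concrete, here is a small user-space model of that lookup (my own
sketch, not the kernel's actual code; it only mimics the shift-and-index scheme, using the
node layout from the boot log above):

#include <stdio.h>

#define MEMNODE_SHIFT 31                 /* "Using 31 for the hash shift" */

/* One byte per 2 GB chunk of physical address space; for the layout above
 * (limit 0x480000000) the whole map is only a few bytes and fits in a
 * single cache line, which is why it stays hot. */
static unsigned char memnodemap[9];

static int phys_to_nid(unsigned long long addr)
{
	return memnodemap[addr >> MEMNODE_SHIFT];
}

int main(void)
{
	unsigned long long samples[] = {
		0x0ULL,			/* node 0 */
		0x100000000ULL,		/* 4 GB, still node 0 */
		0x280000000ULL,		/* start of node 1 */
		0x400000000ULL,		/* node 1 */
	};
	unsigned i;

	/* Fill the map from the boot log above:
	 * node 0: 0x000000000 - 0x280000000, node 1: 0x280000000 - 0x480000000 */
	for (i = 0; i < sizeof(memnodemap); i++)
		memnodemap[i] = (i < (0x280000000ULL >> MEMNODE_SHIFT)) ? 0 : 1;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("addr 0x%llx -> node %d\n", samples[i], phys_to_nid(samples[i]));
	return 0;
}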
virt_to_slab(), on the other hand, has to access:
1) struct page -> page_get_slab() (page->lru.prev) (one cache miss)
2) struct slab -> nodeid (another cache miss)
So using pfn_to_nid() would avoid 2 cache misses.
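For readers not staring at mm/slab.c, the chain above looks roughly like this (a simplified
sketch of the code being discussed, not verbatim kernel source; obj_to_nodeid is a made-up
helper name to keep the example short):

/* Simplified reconstruction of the free-path lookup under discussion. */
static inline struct slab *virt_to_slab(const void *objp)
{
	struct page *page = virt_to_page(objp);	/* 1st miss: struct page */
	return (struct slab *)page->lru.prev;	/* page_get_slab() */
}

static inline int obj_to_nodeid(const void *objp)
{
	return virt_to_slab(objp)->nodeid;	/* 2nd miss: struct slab */
	/* Proposed alternative: pfn_to_nid(__pa(objp) >> PAGE_SHIFT),
	 * which only reads the (usually cache-hot) memnode map. */
}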
I understand we want to do special things (fallback and similar tricks) at
allocation time, but I believe we can simply trust the real nid of the memory
at free time.