Message-ID: <321b8b3e-9d06-b01c-d871-1f7ca35ce91e@suse.cz>
Date: Tue, 2 Aug 2022 11:32:41 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc: Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Joe Perches <joe@...ches.com>,
Vasily Averin <vasily.averin@...ux.dev>,
Matthew Wilcox <willy@...radead.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 08/15] mm/slab_common: kmalloc_node: pass large
requests to page allocator

On 8/2/22 10:59, Hyeonggon Yoo wrote:
> On Mon, Aug 01, 2022 at 04:44:22PM +0200, Vlastimil Babka wrote:
>>
>
> Yeah, uninlining __kmalloc_large_node saves hundreds of bytes.
> And the diff below looks good to me.
>
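> (The diff itself is trimmed from the quote above. As a rough sketch
> of the idea, not the actual patch: drop the forced inlining so the
> page-allocator path gets a single out-of-line copy rather than being
> duplicated into every kmalloc entry point, i.e. something like
>
>   -static __always_inline void *__kmalloc_large_node(size_t size,
>   -                                                  gfp_t flags, int node)
>   +static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
>
> where the exact signature is assumed.)
>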
> By the way, do you have opinions on inlining slab_alloc_node()?
> (Looks like a similar topic?)
>
> AFAIK slab_alloc_node() is inlined in (see the sketch after this list):
> kmem_cache_alloc()
> kmem_cache_alloc_node()
> kmem_cache_alloc_lru()
> kmem_cache_alloc_trace()
> kmem_cache_alloc_node_trace()
> __kmem_cache_alloc_node()
>
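> The pattern in question is roughly the following; this is a
> simplified sketch, not the actual mm/slub.c code (signatures are
> trimmed, and try_percpu_freelist() is a made-up stand-in for the
> real lockless fastpath):
>
>   static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>                   struct list_lru *lru, gfp_t gfpflags, int node,
>                   unsigned long addr)
>   {
>           void *object;
>
>           /* lru is used for memcg accounting, omitted here */
>
>           /* lockless per-cpu freelist fastpath, slowpath on miss */
>           object = try_percpu_freelist(s, node);
>           if (unlikely(!object))
>                   object = __slab_alloc(s, gfpflags, node, addr);
>
>           return object;
>   }
>
>   /* each entry point above gets its own inlined copy of the fastpath */
>   void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
>   {
>           return slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
>                                  _RET_IP_);
>   }
>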
> This is what I get after simply dropping __always_inline in slab_alloc_node:
>
> add/remove: 1/1 grow/shrink: 3/6 up/down: 1911/-5275 (-3364)
> Function                        old    new   delta
> slab_alloc_node                   -   1356   +1356
> sysfs_slab_alias                134    327    +193
> slab_memory_callback            528    717    +189
> __kmem_cache_create            1325   1498    +173
> __slab_alloc.constprop          135      -    -135
> kmem_cache_alloc_trace          909    196    -713
> kmem_cache_alloc                937    191    -746
> kmem_cache_alloc_node_trace    1020    200    -820
> __kmem_cache_alloc_node         862     19    -843
> kmem_cache_alloc_node          1046    189    -857
> kmem_cache_alloc_lru           1348    187   -1161
> Total: Before=32011183, After=32007819, chg -0.01%
>
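> (Numbers in this format come from scripts/bloat-o-meter, presumably
> invoked as something like
>
>   $ ./scripts/bloat-o-meter vmlinux.before vmlinux.after
>
> on images built before and after the change; the file names here are
> assumed.)
>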
> So 3.28kB is the cost of eliminating function call overhead in the
> fastpath.
>
> This is a tradeoff between function call overhead and
> instruction cache usage...

We can investigate this afterwards, with proper measurements etc. I think
it's more performance-sensitive than kmalloc_large_node.