Message-ID: <321b8b3e-9d06-b01c-d871-1f7ca35ce91e@suse.cz>
Date: Tue, 2 Aug 2022 11:32:41 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc: Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Joe Perches <joe@...ches.com>,
Vasily Averin <vasily.averin@...ux.dev>,
Matthew Wilcox <willy@...radead.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 08/15] mm/slab_common: kmalloc_node: pass large
requests to page allocator

On 8/2/22 10:59, Hyeonggon Yoo wrote:
> On Mon, Aug 01, 2022 at 04:44:22PM +0200, Vlastimil Babka wrote:
>>
>
> Yeah, uninlining __kmalloc_large_node saves hundreds of bytes.
> And the diff below looks good to me.
>
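> (The diff itself is trimmed from the quote above. As a rough sketch
> of the idea, not the actual patch: drop the forced inlining so the
> page-allocator path gets a single out-of-line copy rather than being
> duplicated into every kmalloc entry point, i.e. something like
>
>   -static __always_inline void *__kmalloc_large_node(size_t size,
>   -                                                  gfp_t flags, int node)
>   +static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
>
> where the exact signature is assumed.)
>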
> By the way, do you have opinions on inlining slab_alloc_node()?
> (Looks like a similar topic?)
>
> AFAIK slab_alloc_node() is inlined in (see the sketch after this list):
> kmem_cache_alloc()
> kmem_cache_alloc_node()
> kmem_cache_alloc_lru()
> kmem_cache_alloc_trace()
> kmem_cache_alloc_node_trace()
> __kmem_cache_alloc_node()
>
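> The pattern in question is roughly the following; this is a
> simplified sketch, not the actual mm/slub.c code (signatures are
> trimmed, and try_percpu_freelist() is a made-up stand-in for the
> real lockless fastpath):
>
>   static __always_inline void *slab_alloc_node(struct kmem_cache *s,
>                   struct list_lru *lru, gfp_t gfpflags, int node,
>                   unsigned long addr)
>   {
>           void *object;
>
>           /* lru is used for memcg accounting, omitted here */
>
>           /* lockless per-cpu freelist fastpath, slowpath on miss */
>           object = try_percpu_freelist(s, node);
>           if (unlikely(!object))
>                   object = __slab_alloc(s, gfpflags, node, addr);
>
>           return object;
>   }
>
>   /* each entry point above gets its own inlined copy of the fastpath */
>   void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
>   {
>           return slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
>                                  _RET_IP_);
>   }
>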
> This is what I get after simply dropping __always_inline in slab_alloc_node:
>
> add/remove: 1/1 grow/shrink: 3/6 up/down: 1911/-5275 (-3364)
> Function                        old    new   delta
> slab_alloc_node                   -   1356   +1356
> sysfs_slab_alias                134    327    +193
> slab_memory_callback            528    717    +189
> __kmem_cache_create            1325   1498    +173
> __slab_alloc.constprop          135      -    -135
> kmem_cache_alloc_trace          909    196    -713
> kmem_cache_alloc                937    191    -746
> kmem_cache_alloc_node_trace    1020    200    -820
> __kmem_cache_alloc_node         862     19    -843
> kmem_cache_alloc_node          1046    189    -857
> kmem_cache_alloc_lru           1348    187   -1161
> Total: Before=32011183, After=32007819, chg -0.01%
>
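> (Numbers in this format come from scripts/bloat-o-meter, presumably
> invoked as something like
>
>   $ ./scripts/bloat-o-meter vmlinux.before vmlinux.after
>
> on images built before and after the change; the file names here are
> assumed.)
>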
> So 3.28kB is the cost of eliminating function call overhead in the
> fastpath.
>
> This is a tradeoff between function call overhead and
> instruction cache usage...

We can investigate this afterwards, with proper measurements etc. I think
it's more performance-sensitive than kmalloc_large_node.