Message-ID: <c14cd89b-c879-4474-a800-d60fc29c1820@gmail.com>
Date: Fri, 5 Apr 2024 15:37:12 +0200
From: Klara Modin <klarasmodin@...il.com>
To: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org
Cc: kent.overstreet@...ux.dev, mhocko@...e.com, vbabka@...e.cz,
hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com,
penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, jhubbard@...dia.com, tj@...nel.org,
muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com,
dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com,
gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, aliceryhl@...gle.com,
rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
kernel-team@...roid.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-modules@...r.kernel.org,
kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v6 00/37] Memory allocation profiling
Hi,
On 2024-03-21 17:36, Suren Baghdasaryan wrote:
> Overview:
> Low overhead [1] per-callsite memory allocation profiling. Not just for
> debug kernels: the overhead is low enough to be deployed in production.
>
> Example output:
> root@...ia-kvm:~# sort -rn /proc/allocinfo
> 127664128 31168 mm/page_ext.c:270 func:alloc_page_ext
> 56373248 4737 mm/slub.c:2259 func:alloc_slab_page
> 14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded
> 14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
> 13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
> 11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio
> 9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node
> 4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
> 4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
> 3940352 962 mm/memory.c:4214 func:alloc_anon_folio
> 2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
> ...
>
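> Each line shows the total bytes currently allocated from that callsite,
> the number of live allocations, and the callsite itself (file:line and
> function, with the owning module in brackets for module callsites). As a
> rough sanity check, summing the first column approximates the total
> tracked memory, e.g.:
>
>   $ awk '{total += $1} END {print total, "bytes"}' /proc/allocinfo
>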
> Since v5 [2]:
> - Added Reviewed-by and Acked-by, per Vlastimil Babka and Miguel Ojeda
> - Changed pgalloc_tag_{add|sub} to use number of pages instead of order, per Matthew Wilcox
> - Changed pgalloc_tag_sub_bytes to pgalloc_tag_sub_pages and adjusted the usage, per Matthew Wilcox
> - Moved static key check before prepare_slab_obj_exts_hook(), per Vlastimil Babka
> - Fixed RUST helper, per Miguel Ojeda
> - Fixed documentation, per Randy Dunlap
> - Rebased over mm-unstable
>
> Usage:
> kconfig options:
> - CONFIG_MEM_ALLOC_PROFILING
> - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> - CONFIG_MEM_ALLOC_PROFILING_DEBUG
> adds warnings for allocations that weren't accounted for because of a
> missing annotation
>
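> For example, a build matching configuration (2) below could be prepared
> with the kernel's scripts/config helper (just a sketch):
>
>   $ scripts/config --enable MEM_ALLOC_PROFILING \
>                    --disable MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
>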
> sysctl:
> /proc/sys/vm/mem_profiling
>
> Runtime info:
> /proc/allocinfo
>
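> For example, profiling can be toggled and inspected at runtime roughly
> like this (illustrative only):
>
>   # echo 1 > /proc/sys/vm/mem_profiling
>   # sort -rn /proc/allocinfo | head
>   # echo 0 > /proc/sys/vm/mem_profiling
>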
> Notes:
>
> [1]: Overhead
> To measure the overhead we are comparing the following configurations:
> (1) Baseline with CONFIG_MEMCG_KMEM=n
> (2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n)
> (3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y)
> (4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
> (5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
> (6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y
> (7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y
>
> Performance overhead:
> To evaluate performance we implemented an in-kernel test that executes
> multiple get_free_page/free_page and kmalloc/kfree calls with allocation
> sizes growing from 8 to 240 bytes. CPU frequency was set to max and CPU
> affinity was pinned to a specific CPU to minimize noise. Below are results
> from running the test on Ubuntu 22.04.2 LTS with a 6.8.0-rc1 kernel on a
> 56-core Intel Xeon:
>
> kmalloc pgalloc
> (1 baseline) 6.764s 16.902s
> (2 default disabled) 6.793s (+0.43%) 17.007s (+0.62%)
> (3 default enabled) 7.197s (+6.40%) 23.666s (+40.02%)
> (4 runtime enabled) 7.405s (+9.48%) 23.901s (+41.41%)
> (5 memcg) 13.388s (+97.94%) 48.460s (+186.71%)
> (6 def disabled+memcg) 13.332s (+97.10%) 48.105s (+184.61%)
> (7 def enabled+memcg) 13.446s (+98.78%) 54.963s (+225.18%)
>
> Memory overhead:
> Kernel size:
>
> text data bss dec diff
> (1) 26515311 18890222 17018880 62424413
> (2) 26524728 19423818 16740352 62688898 264485
> (3) 26524724 19423818 16740352 62688894 264481
> (4) 26524728 19423818 16740352 62688898 264485
> (5) 26541782 18964374 16957440 62463596 39183
>
> Memory consumption on a 56-core Intel CPU with 125 GB of memory:
> Code tags: 192 kB
> PageExts: 262144 kB (256MB)
> SlabExts: 9876 kB (9.6MB)
> PcpuExts: 512 kB (0.5MB)
>
> Total overhead is 0.2% of total memory.
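> (For reference: 192 kB + 262144 kB + 9876 kB + 512 kB = 272724 kB, about
> 266 MB, which is roughly 0.21% of 125 GB.)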
>
> Benchmarks:
>
> Hackbench tests run 100 times:
> hackbench -s 512 -l 200 -g 15 -f 25 -P
> baseline disabled profiling enabled profiling
> avg 0.3543 0.3559 (+0.0016) 0.3566 (+0.0023)
> stdev 0.0137 0.0188 0.0077
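> (Both deltas are well within one standard deviation of the baseline, so
> the difference is in the noise here.)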
>
>
> hackbench -l 10000
> baseline disabled profiling enabled profiling
> avg 6.4218 6.4306 (+0.0088) 6.5077 (+0.0859)
> stdev 0.0933 0.0286 0.0489
>
> stress-ng tests:
> stress-ng --class memory --seq 4 -t 60
> stress-ng --class cpu --seq 4 -t 60
> Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/
>
> [2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/
If I enable this, I consistently get percpu allocation failures. I can
occasionally reproduce it in qemu. I've attached the logs and my config;
please let me know if there's anything else that could be relevant.
Kind regards,
Klara Modin
Download attachment "debug_alloc_profiling.log.gz" of type "application/gzip" (28378 bytes)
Download attachment "config.gz" of type "application/gzip" (38465 bytes)
Download attachment "qemu-alloc3.log.gz" of type "application/gzip" (14651 bytes)