Message-ID: <67453a56-d4c2-4dc8-a5db-0a4665e40856@suse.cz>
Date: Tue, 27 Feb 2024 14:36:14 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org
Cc: kent.overstreet@...ux.dev, mhocko@...e.com, hannes@...xchg.org,
roman.gushchin@...ux.dev, mgorman@...e.de, dave@...olabs.net,
willy@...radead.org, liam.howlett@...cle.com,
penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev,
rppt@...nel.org, paulmck@...nel.org, pasha.tatashin@...een.com,
yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com,
hughd@...gle.com, andreyknvl@...il.com, keescook@...omium.org,
ndesaulniers@...gle.com, vvvvvv@...gle.com, gregkh@...uxfoundation.org,
ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
bristot@...hat.com, vschneid@...hat.com, cl@...ux.com, penberg@...nel.org,
iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com,
elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v4 00/36] Memory allocation profiling

On 2/21/24 20:40, Suren Baghdasaryan wrote:
> Overview:
> Low overhead [1] per-callsite memory allocation profiling. Not just for
> debug kernels, overhead low enough to be deployed in production.
>
> Example output:
> root@...ia-kvm:~# sort -rn /proc/allocinfo
> 127664128 31168 mm/page_ext.c:270 func:alloc_page_ext
> 56373248 4737 mm/slub.c:2259 func:alloc_slab_page
> 14880768 3633 mm/readahead.c:247 func:page_cache_ra_unbounded
> 14417920 3520 mm/mm_init.c:2530 func:alloc_large_system_hash
> 13377536 234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
> 11718656 2861 mm/filemap.c:1919 func:__filemap_get_folio
> 9192960 2800 kernel/fork.c:307 func:alloc_thread_stack_node
> 4206592 4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
> 4136960 1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
> 3940352 962 mm/memory.c:4214 func:alloc_anon_folio
> 2894464 22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
> ...
>
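
For readers who want to post-process the example output above, here is a
minimal Python sketch of a line parser. The field layout (bytes, call count,
file:line, optional [module], func:name) is inferred only from the sample
shown in this thread and is an assumption, not a documented format; the names
AllocSite, parse_allocinfo_line and top_sites are hypothetical helpers, not
part of the patch set.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class AllocSite(NamedTuple):
    """One allocation callsite, as inferred from the sample output."""
    size_bytes: int
    calls: int
    file: str
    line: int
    module: Optional[str]  # only present for loadable modules
    func: str


# Assumed layout: "<bytes> <calls> <file>:<line> [<module>] func:<name>"
_LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+):(\d+)\s+(?:\[([^\]]+)\]\s+)?func:(\S+)\s*$"
)


def parse_allocinfo_line(text: str) -> Optional[AllocSite]:
    """Parse one line; return None if it does not match the assumed layout."""
    m = _LINE_RE.match(text)
    if m is None:
        return None
    size, calls, path, lineno, module, func = m.groups()
    return AllocSite(int(size), int(calls), path, int(lineno), module, func)


def top_sites(lines: Iterable[str], n: int = 3) -> List[AllocSite]:
    """Return the n callsites with the most bytes, largest first."""
    sites = [s for s in map(parse_allocinfo_line, lines) if s is not None]
    return sorted(sites, key=lambda s: s.size_bytes, reverse=True)[:n]
```

Passing the lines of /proc/allocinfo to top_sites() gives roughly the same
ordering as the sort -rn invocation above, with the fields split out for
further filtering (for example, by module).
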
> Since v3:
> - Dropped patch changing string_get_size() [2] as not needed
> - Dropped patch modifying xfs allocators [3] as not needed,
> per Dave Chinner
> - Added Reviewed-by, per Kees Cook
> - Moved prepare_slab_obj_exts_hook() and alloc_slab_obj_exts() where they
> are used, per Vlastimil Babka
> - Fixed SLAB_NO_OBJ_EXT definition to use unused bit, per Vlastimil Babka
> - Refactored patch [4] into other patches, per Vlastimil Babka
> - Replaced snprintf() with seq_buf_printf(), per Kees Cook
> - Changed output to report bytes, per Andrew Morton and Pasha Tatashin
> - Changed output to report [module] only for loadable modules,
> per Vlastimil Babka
> - Moved mem_alloc_profiling_enabled() check earlier, per Vlastimil Babka
> - Changed the code to handle page splitting to be more understandable,
> per Vlastimil Babka
> - Moved alloc_tagging_slab_free_hook(), mark_objexts_empty(),
> mark_failed_objexts_alloc() and handle_failed_objexts_alloc(),
> per Vlastimil Babka
> - Fixed loss of __alloc_size(1, 2) in kvmalloc functions,
> per Vlastimil Babka
> - Refactored the code in show_mem() to avoid memory allocations,
> per Michal Hocko
> - Changed to trylock in show_mem() to avoid blocking in atomic context,
> per Tetsuo Handa
> - Added mm mailing list into MAINTAINERS, per Kees Cook
> - Added base commit SHA, per Andy Shevchenko
> - Added a patch with documentation, per Jani Nikula
> - Fixed 0day bugs
> - Added benchmark results [5], per Steven Rostedt
> - Rebased over Linux 6.8-rc5
>
> Items not yet addressed:
> - An early_boot option to prevent pageext overhead. We are looking into
> ways of using the same sysctl instead of adding an additional early boot
> parameter.

I have reviewed the parts that integrate the tracking with the page and slab
allocators, and apart from some details to improve, it seems OK to me. The
early boot option seems to be coming, so this might eventually be suitable
for build-time enablement in a distro kernel.

The macros (and their potential spread to upper layers to keep the
information useful enough) are of course ugly, but I guess it can't be
helped currently, and I'm unable to decide whether it's worth it or not.
I guess that's up to those providing their success stories. If there's
at least a path ahead to replace that part with compiler support in the
future, great. So I'm not against merging this. BTW, do we know Linus's
opinion on the macros approach?