Message-ID: <lkozkbcucokzaicygwn7ym2cmmdt6bwyrluxb7ka7ygnrgyyfh@ktvirhq3hrtn>
Date: Wed, 14 Feb 2024 10:13:09 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Matthew Wilcox <willy@...radead.org>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
David Hildenbrand <david@...hat.com>, Michal Hocko <mhocko@...e.com>, akpm@...ux-foundation.org,
vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
dave@...olabs.net, liam.howlett@...cle.com, corbet@....net, void@...ifault.com,
peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org,
arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com, axboe@...nel.dk,
mcgrof@...nel.org, masahiroy@...nel.org, nathan@...nel.org, dennis@...nel.org,
tj@...nel.org, muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com,
hughd@...gle.com, andreyknvl@...il.com, keescook@...omium.org,
ndesaulniers@...gle.com, vvvvvv@...gle.com, gregkh@...uxfoundation.org,
ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, bristot@...hat.com,
vschneid@...hat.com, cl@...ux.com, penberg@...nel.org, iamjoonsoo.kim@....com,
42.hyeyoo@...il.com, glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
shakeelb@...gle.com, songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v3 00/35] Memory allocation profiling
On Wed, Feb 14, 2024 at 03:00:00PM +0000, Matthew Wilcox wrote:
> On Tue, Feb 13, 2024 at 06:08:45PM -0500, Kent Overstreet wrote:
> > This is what instrumenting an allocation function looks like:
> >
> > #define krealloc_array(...) alloc_hooks(krealloc_array_noprof(__VA_ARGS__))
> >
> > IOW, we have to:
> > - rename krealloc_array to krealloc_array_noprof
> > - replace krealloc_array with a one wrapper macro call
> >
> > Is this really all we're getting worked up over?
> >
> > The renaming we need regardless, because the thing that makes this
> > approach efficient enough to run in production is that we account at
> > _one_ point in the callstack, we don't save entire backtraces.
>
> I'm probably going to regret getting involved in this thread, but since
> Suren already decided to put me on the cc ...
>
> There might be a way to do it without renaming. We have a bit of the
> linker script called SCHED_TEXT which lets us implement
> in_sched_functions(). ie we could have the equivalent of
>
> include/linux/sched/debug.h:#define __sched __section(".sched.text")
>
> perhaps #define __memalloc __section(".memalloc.text")
> which would do all the necessary magic to know where the backtrace
> should stop.
Could we please try to get through the cover letter before proposing
alternatives? I already explained there why we need the renaming.
In addition, you can't create the per-callsite codetag with linker
magic; you need the macro for that.
Instead of citing myself again, I'm just going to post what I was
working on last night for the documentation directory:
.. SPDX-License-Identifier: GPL-2.0
===========================
MEMORY ALLOCATION PROFILING
===========================
Low overhead (suitable for production) accounting of all memory allocations,
tracked by file and line number.
Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation
sysctl:
  /proc/sys/vm/mem_profiling
Runtime info:
  /proc/allocinfo
Example output:
root@...ia-kvm:~# sort -h /proc/allocinfo|tail
3.11MiB  2850 fs/ext4/super.c:1408 module:ext4 func:ext4_alloc_inode
3.52MiB   225 kernel/fork.c:356 module:fork func:alloc_thread_stack_node
3.75MiB   960 mm/page_ext.c:270 module:page_ext func:alloc_page_ext
4.00MiB     2 mm/khugepaged.c:893 module:khugepaged func:hpage_collapse_alloc_folio
10.5MiB   168 block/blk-mq.c:3421 module:blk_mq func:blk_mq_alloc_rqs
14.0MiB  3594 include/linux/gfp.h:295 module:filemap func:folio_alloc_noprof
26.8MiB  6856 include/linux/gfp.h:295 module:memory func:folio_alloc_noprof
64.5MiB 98315 fs/xfs/xfs_rmap_item.c:147 module:xfs func:xfs_rui_init
98.7MiB 25264 include/linux/gfp.h:295 module:readahead func:folio_alloc_noprof
 125MiB  7357 mm/slub.c:2201 module:slub func:alloc_slab_page
Theory of operation:
Memory allocation profiling builds on code tagging, a library for
declaring static structs (that typically describe a file and line number in
some way, hence code tagging) and then finding and operating on them at
runtime, i.e. iterating over them to print them in debugfs/procfs.
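As a rough userspace illustration of the code tagging idea (not the kernel's
actual implementation), the sketch below places one static struct per call
site into a named section and walks the section at runtime. It assumes GCC or
Clang on an ELF platform, where the linker provides __start_/__stop_ symbols
for sections whose names are valid C identifiers. struct codetag as laid out
here, CODE_TAG_HERE(), site_one(), site_two() and codetag_count() are all
hypothetical names made up for this example.

```c
#include <stddef.h>

/* Hypothetical, simplified code tag: just a source location. */
struct codetag {
	const char *file;
	int line;
};

/*
 * Declare one static tag per macro expansion and place it in the
 * "codetags" section. Evaluates to a pointer to that call site's tag.
 * Uses a GNU C statement expression.
 */
#define CODE_TAG_HERE()                                                   \
({                                                                        \
	static struct codetag _tag                                        \
		__attribute__((section("codetags"), used, aligned(8))) =  \
		{ __FILE__, __LINE__ };                                   \
	&_tag;                                                            \
})

/* Section bounds; assumed to be provided by the linker (ELF, GNU ld/lld). */
extern struct codetag __start_codetags[];
extern struct codetag __stop_codetags[];

/* Two distinct call sites, each with its own tag. */
static struct codetag *site_one(void) { return CODE_TAG_HERE(); }
static struct codetag *site_two(void) { return CODE_TAG_HERE(); }

/* Walk every tag in the program, as a procfs/debugfs dump would. */
static size_t codetag_count(void)
{
	size_t n = 0;

	for (struct codetag *t = __start_codetags; t < __stop_codetags; t++)
		n++;
	return n;
}
```

The point of the sketch is that the tag is created by the macro expansion
itself, so each call site gets its own struct without any runtime
registration.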
To add accounting for an allocation call, we replace it with a macro
invocation, alloc_hooks(), that
- declares a code tag
- stashes a pointer to it in task_struct
- calls the real allocation function
- and finally, restores the task_struct alloc tag pointer to its previous value.
This allows for alloc_hooks() calls to be nested, with the most recent one
taking effect. This is important for allocations internal to the mm/ code that
do not properly belong to the outer allocation context and should be counted
separately: for example, slab object extension vectors, or when the slab
allocates pages from the page allocator.
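The save/set/call/restore sequence and its nesting behaviour can be modelled
in userspace roughly as follows. This is a minimal sketch under stated
assumptions, not the kernel macro: a global pointer stands in for the
task_struct field, malloc() stands in for the real allocator, counters are
cumulative (frees are not tracked), and my_alloc(), pool_alloc() and the
all_tags list are hypothetical names that exist only in this example.

```c
#include <stddef.h>
#include <stdlib.h>

struct alloc_tag {
	const char *file;
	int line;
	size_t bytes;            /* cumulative bytes charged to this site */
	size_t calls;            /* number of allocations charged */
	struct alloc_tag *next;  /* registry link, for this sketch only */
};

static struct alloc_tag *all_tags;    /* tags charged at least once */
static struct alloc_tag *current_tag; /* stand-in for the task_struct pointer */

/* Declare a tag, make it current, run the call, restore the old tag. */
#define alloc_hooks(call)                                                  \
({                                                                         \
	static struct alloc_tag _tag = { __FILE__, __LINE__, 0, 0, NULL }; \
	struct alloc_tag *_old = current_tag;                              \
	current_tag = &_tag;                                               \
	__typeof__(call) _ret = (call);                                    \
	current_tag = _old;                                                \
	_ret;                                                              \
})

/* The "real" allocation function: charges whichever tag is current. */
static void *my_alloc_noprof(size_t size)
{
	void *p = malloc(size);

	if (p && current_tag) {
		if (!current_tag->calls) {      /* first hit: register it */
			current_tag->next = all_tags;
			all_tags = current_tag;
		}
		current_tag->bytes += size;
		current_tag->calls++;
	}
	return p;
}
#define my_alloc(...) alloc_hooks(my_alloc_noprof(__VA_ARGS__))

/*
 * An allocator with internal bookkeeping: the nested my_alloc() carries
 * its own tag, so the 8 bytes are counted separately from the caller's
 * allocation.
 */
static void *pool_alloc_noprof(size_t size)
{
	void *meta = my_alloc(8);       /* charged to this line */

	free(meta);
	return my_alloc_noprof(size);   /* charged to the caller's tag */
}
#define pool_alloc(...) alloc_hooks(pool_alloc_noprof(__VA_ARGS__))
```

Calling pool_alloc(100) in this model leaves two tags in the list: one for the
caller's line with 100 bytes, and one for the internal line with 8 bytes.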
Thus, proper usage requires determining which function in an allocation call
stack should be tagged. There are many helper functions that essentially wrap
e.g. kmalloc() and do a little more work, then are called in multiple places;
we'll generally want the accounting to happen in the callers of these helpers,
not in the helpers themselves.
To fix up a given helper, for example foo(), do the following:
- switch its allocation call to the _noprof() version, e.g. kmalloc_noprof()
- rename it to foo_noprof()
- define a macro version of foo() like so:
#define foo(...) alloc_hooks(foo_noprof(__VA_ARGS__))
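For illustration, here is what those three steps might look like on a small
string-duplicating helper, again as a userspace sketch rather than kernel
code. kmalloc_noprof() below is a mock with a simplified signature (the real
one also takes gfp flags), and alloc_hooks(), last_charged, caller_a() and
caller_b() are stand-ins defined only for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct alloc_tag {
	const char *file;
	int line;
	size_t bytes;   /* cumulative bytes charged to this site */
};

static struct alloc_tag *current_tag;  /* stand-in for the task_struct pointer */
static struct alloc_tag *last_charged; /* sketch-only: lets us inspect results */

#define alloc_hooks(call)                                         \
({                                                                \
	static struct alloc_tag _tag = { __FILE__, __LINE__, 0 }; \
	struct alloc_tag *_old = current_tag;                     \
	current_tag = &_tag;                                      \
	__typeof__(call) _ret = (call);                           \
	current_tag = _old;                                       \
	_ret;                                                     \
})

/* Userspace mock of the untagged allocator entry point. */
static void *kmalloc_noprof(size_t size)
{
	void *p = malloc(size);

	if (p && current_tag) {
		current_tag->bytes += size;
		last_charged = current_tag;
	}
	return p;
}

/* Steps 1 and 2: renamed to foo_noprof(), calls kmalloc_noprof(). */
static char *foo_noprof(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = kmalloc_noprof(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Step 3: foo() becomes a macro, so the tag lives at each caller. */
#define foo(...) alloc_hooks(foo_noprof(__VA_ARGS__))

/* Two callers of the helper: each is accounted under its own line. */
static char *caller_a(void) { return foo("aaa"); }
static char *caller_b(void) { return foo("bbbbb"); }
```

With the helper converted this way, the two callers end up with separate tags
instead of both being charged to the single allocation line inside the helper.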