Message-ID: <CAJuCfpGemg-aXyiK1fHavdKuW+-9+DM5_4krLAdg+DQh=24Dvg@mail.gmail.com>
Date: Sun, 18 Feb 2024 02:21:18 +0000
From: Suren Baghdasaryan <surenb@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: akpm@...ux-foundation.org, kent.overstreet@...ux.dev, mhocko@...e.com, 
	hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de, 
	dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com, 
	corbet@....net, void@...ifault.com, peterz@...radead.org, 
	juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org, 
	arnd@...db.de, tglx@...utronix.de, mingo@...hat.com, 
	dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com, 
	david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org, 
	nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev, 
	rppt@...nel.org, paulmck@...nel.org, pasha.tatashin@...een.com, 
	yosryahmed@...gle.com, yuzhao@...gle.com, dhowells@...hat.com, 
	hughd@...gle.com, andreyknvl@...il.com, keescook@...omium.org, 
	ndesaulniers@...gle.com, vvvvvv@...gle.com, gregkh@...uxfoundation.org, 
	ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org, 
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, 
	bristot@...hat.com, vschneid@...hat.com, cl@...ux.com, penberg@...nel.org, 
	iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, glider@...gle.com, 
	elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com, 
	songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com, 
	minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com, 
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
	iommu@...ts.linux.dev, linux-arch@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, 
	linux-modules@...r.kernel.org, kasan-dev@...glegroups.com, 
	cgroups@...r.kernel.org
Subject: Re: [PATCH v3 13/35] lib: add allocation tagging support for memory allocation profiling

On Fri, Feb 16, 2024 at 8:57 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 2/12/24 22:38, Suren Baghdasaryan wrote:
> > Introduce CONFIG_MEM_ALLOC_PROFILING which provides definitions to easily
> > instrument memory allocators. It registers an "alloc_tags" codetag type
> > with /proc/allocinfo interface to output allocation tag information when
> > the feature is enabled.
> > CONFIG_MEM_ALLOC_PROFILING_DEBUG is provided for debugging the memory
> > allocation profiling instrumentation.
> > Memory allocation profiling can be enabled or disabled at runtime using
> > /proc/sys/vm/mem_profiling sysctl when CONFIG_MEM_ALLOC_PROFILING_DEBUG=n.
> > CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT enables memory allocation
> > profiling by default.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> > Co-developed-by: Kent Overstreet <kent.overstreet@...ux.dev>
> > Signed-off-by: Kent Overstreet <kent.overstreet@...ux.dev>
> > ---
> >  Documentation/admin-guide/sysctl/vm.rst |  16 +++
> >  Documentation/filesystems/proc.rst      |  28 +++++
> >  include/asm-generic/codetag.lds.h       |  14 +++
> >  include/asm-generic/vmlinux.lds.h       |   3 +
> >  include/linux/alloc_tag.h               | 133 ++++++++++++++++++++
> >  include/linux/sched.h                   |  24 ++++
> >  lib/Kconfig.debug                       |  25 ++++
> >  lib/Makefile                            |   2 +
> >  lib/alloc_tag.c                         | 158 ++++++++++++++++++++++++
> >  scripts/module.lds.S                    |   7 ++
> >  10 files changed, 410 insertions(+)
> >  create mode 100644 include/asm-generic/codetag.lds.h
> >  create mode 100644 include/linux/alloc_tag.h
> >  create mode 100644 lib/alloc_tag.c
> >
> > diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> > index c59889de122b..a214719492ea 100644
> > --- a/Documentation/admin-guide/sysctl/vm.rst
> > +++ b/Documentation/admin-guide/sysctl/vm.rst
> > @@ -43,6 +43,7 @@ Currently, these files are in /proc/sys/vm:
> >  - legacy_va_layout
> >  - lowmem_reserve_ratio
> >  - max_map_count
> > +- mem_profiling         (only if CONFIG_MEM_ALLOC_PROFILING=y)
> >  - memory_failure_early_kill
> >  - memory_failure_recovery
> >  - min_free_kbytes
> > @@ -425,6 +426,21 @@ e.g., up to one or two maps per allocation.
> >  The default value is 65530.
> >
> >
> > +mem_profiling
> > +==============
> > +
> > +Enable memory profiling (when CONFIG_MEM_ALLOC_PROFILING=y)
> > +
> > +1: Enable memory profiling.
> > +
> > +0: Disabld memory profiling.
>
>       Disable

Ack.

>
> ...
>
> > +allocinfo
> > +~~~~~~~
> > +
> > +Provides information about memory allocations at all locations in the code
> > +base. Each allocation in the code is identified by its source file, line
> > +number, module and the function calling the allocation. The number of bytes
> > +allocated at each location is reported.
>
> See, it even says "number of bytes" :)

Yes, we are changing the output to bytes.

>
> > +
> > +Example output.
> > +
> > +::
> > +
> > +    > cat /proc/allocinfo
> > +
> > +      153MiB     mm/slub.c:1826 module:slub func:alloc_slab_page
>
> Is "module" meant in the usual kernel module sense? In that case IIRC it's
> more common to annotate things as e.g. [xfs] when it's really a module, and
> nothing if it's built in, such as slub. Is that "slub" simply derived from
> "mm/slub.c"? Then it's just redundant?

Sounds good. The new example would look like this:

    > sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Note that the ctagmod.c line, annotated with [ctagmod], is the only
allocation from a module in this example. Built-in code gets no annotation.
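
For anyone scripting against this, here is a minimal userspace sketch of
parsing the format above. It is only an illustration: the field layout is
inferred from the example, the second column is assumed to be a count, and
the names AllocTag, parse_line and parse_allocinfo are made up for this
sketch, not part of the patch.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed layout, inferred from the example output in this mail:
#   <bytes> <count> <file>:<line> [<module>] func:<function>
# [<module>] appears only when the allocation comes from a loadable module.
LINE_RE = re.compile(
    r"^\s*(?P<size_bytes>\d+)\s+(?P<calls>\d+)\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+"
    r"(?:\[(?P<module>[^\]]+)\]\s+)?"
    r"func:(?P<func>\S+)\s*$"
)


class AllocTag(NamedTuple):
    size_bytes: int         # first column
    calls: int              # second column; assumed to be a count
    file: str
    line: int
    module: Optional[str]   # None for built-in code
    func: str


def parse_line(text: str) -> Optional[AllocTag]:
    """Parse one line; return None for lines that do not match (e.g. '...')."""
    m = LINE_RE.match(text)
    if m is None:
        return None
    return AllocTag(int(m["size_bytes"]), int(m["calls"]), m["file"],
                    int(m["line"]), m["module"], m["func"])


def parse_allocinfo(text: str) -> List[AllocTag]:
    """Parse a whole dump, silently skipping lines that do not match."""
    return [tag for tag in map(parse_line, text.splitlines()) if tag is not None]


# Two lines copied from the example above, plus the elision marker.
SAMPLE = """\
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     ...
"""

if __name__ == "__main__":
    for tag in parse_allocinfo(SAMPLE):
        origin = tag.module or "built-in"
        print(f"{tag.size_bytes:>12} {origin:<10} {tag.file}:{tag.line} {tag.func}")
```

With the module name as an optional bracketed field, a script can filter on
"module is None" instead of guessing from the file path.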

>
> > +     6.08MiB     mm/slab_common.c:950 module:slab_common func:_kmalloc_order
> > +     5.09MiB     mm/memcontrol.c:2814 module:memcontrol func:alloc_slab_obj_exts
> > +     4.54MiB     mm/page_alloc.c:5777 module:page_alloc func:alloc_pages_exact
> > +     1.32MiB     include/asm-generic/pgalloc.h:63 module:pgtable func:__pte_alloc_one
> > +     1.16MiB     fs/xfs/xfs_log_priv.h:700 module:xfs func:xlog_kvmalloc
> > +     1.00MiB     mm/swap_cgroup.c:48 module:swap_cgroup func:swap_cgroup_prepare
> > +      734KiB     fs/xfs/kmem.c:20 module:xfs func:kmem_alloc
> > +      640KiB     kernel/rcu/tree.c:3184 module:tree func:fill_page_cache_func
> > +      640KiB     drivers/char/virtio_console.c:452 module:virtio_console func:alloc_buf
> > +      ...
> > +
> > +
> >  meminfo
>
> ...
>
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 0be2d00c3696..78d258ca508f 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -972,6 +972,31 @@ config CODE_TAGGING
> >       bool
> >       select KALLSYMS
> >
> > +config MEM_ALLOC_PROFILING
> > +     bool "Enable memory allocation profiling"
> > +     default n
> > +     depends on PROC_FS
> > +     depends on !DEBUG_FORCE_WEAK_PER_CPU
> > +     select CODE_TAGGING
> > +     help
> > +       Track allocation source code and record total allocation size
> > +       initiated at that code location. The mechanism can be used to track
> > +       memory leaks with a low performance and memory impact.
> > +
> > +config MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> > +     bool "Enable memory allocation profiling by default"
> > +     default y
>
> I'd go with default n as that's what I'd select for a general distro.

Well, MEM_ALLOC_PROFILING defaults to n, so if someone switched it on
manually, that is a strong sign they want profiling enabled, IMO.
Defaulting this option to y therefore seems logical to me. If a distro
wants the feature compiled in but disabled by default, that is perfectly
doable: set MEM_ALLOC_PROFILING=y and
MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n.
Does my logic make sense?
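
To spell out how the three options combine, here is a toy model based
only on the commit message and the Kconfig hunk quoted above. It is not
kernel code, and ProfilingState and resolve are hypothetical names used
just for this illustration.

```python
from typing import NamedTuple


class ProfilingState(NamedTuple):
    compiled_in: bool      # CONFIG_MEM_ALLOC_PROFILING=y
    on_at_boot: bool       # profiling active without any runtime action
    runtime_toggle: bool   # /proc/sys/vm/mem_profiling can switch it


def resolve(mem_alloc_profiling: bool,
            enabled_by_default: bool = True,   # Kconfig "default y"
            debug: bool = False) -> ProfilingState:
    """Model the effective state for a given Kconfig selection."""
    if not mem_alloc_profiling:
        # The other two options depend on MEM_ALLOC_PROFILING.
        return ProfilingState(False, False, False)
    if debug:
        # MEM_ALLOC_PROFILING_DEBUG selects ..._ENABLED_BY_DEFAULT.
        enabled_by_default = True
    # Per the commit message, the sysctl toggle applies when DEBUG=n.
    return ProfilingState(True, enabled_by_default, not debug)


if __name__ == "__main__":
    # A distro that wants the feature built in but off until requested.
    print(resolve(True, enabled_by_default=False))
    # A developer who only flipped MEM_ALLOC_PROFILING=y.
    print(resolve(True))
```

The second call is the case I am arguing for: flipping only
MEM_ALLOC_PROFILING=y gives profiling that is on at boot, without having
to find a second option.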

>
> > +     depends on MEM_ALLOC_PROFILING
> > +
> > +config MEM_ALLOC_PROFILING_DEBUG
> > +     bool "Memory allocation profiler debugging"
> > +     default n
> > +     depends on MEM_ALLOC_PROFILING
> > +     select MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
> > +     help
> > +       Adds warnings with helpful error messages for memory allocation
> > +       profiling.
> > +
>
