Date: Fri, 5 Apr 2024 15:37:12 +0200
From: Klara Modin <klarasmodin@...il.com>
To: Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org
Cc: kent.overstreet@...ux.dev, mhocko@...e.com, vbabka@...e.cz,
 hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
 dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com,
 penguin-kernel@...ove.sakura.ne.jp, corbet@....net, void@...ifault.com,
 peterz@...radead.org, juri.lelli@...hat.com, catalin.marinas@....com,
 will@...nel.org, arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
 dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
 david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
 nathan@...nel.org, dennis@...nel.org, jhubbard@...dia.com, tj@...nel.org,
 muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
 pasha.tatashin@...een.com, yosryahmed@...gle.com, yuzhao@...gle.com,
 dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
 keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com,
 gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
 vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
 bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
 penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
 glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
 songmuchun@...edance.com, jbaron@...mai.com, aliceryhl@...gle.com,
 rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
 kernel-team@...roid.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
 linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-mm@...ck.org, linux-modules@...r.kernel.org,
 kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH v6 00/37] Memory allocation profiling

Hi,

On 2024-03-21 17:36, Suren Baghdasaryan wrote:
> Overview:
> Low overhead [1] per-callsite memory allocation profiling. Not just for
> debug kernels, overhead low enough to be deployed in production.
> 
> Example output:
>    root@...ia-kvm:~# sort -rn /proc/allocinfo
>     127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
>      56373248     4737 mm/slub.c:2259 func:alloc_slab_page
>      14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
>      14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
>      13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
>      11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
>       9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
>       4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
>       4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
>       3940352      962 mm/memory.c:4214 func:alloc_anon_folio
>       2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
>       ...
> 
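
For anyone who wants to post-process this output, here is a minimal Python sketch of a parser. It assumes each line has the shape "<bytes> <calls> <file>:<line> [module] func:<name>", which is inferred only from the example output quoted above, not from the patches themselves. The names AllocSite, parse_allocinfo_line and top_sites are hypothetical helpers, not part of the series.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class AllocSite(NamedTuple):
    """One call site from /proc/allocinfo (hypothetical container type)."""
    size: int              # bytes currently allocated from this call site
    calls: int             # number of live allocations
    file: str              # source file path
    line: int              # source line number
    module: Optional[str]  # module name if the line has a "[module]" field
    func: str              # function name taken from the "func:" field


# Assumed layout, inferred from the example output only:
#   <bytes> <calls> <file>:<line> [module] func:<name>
_LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+):(\d+)\s+(?:\[(\S+)\]\s+)?func:(\S+)\s*$"
)


def parse_allocinfo_line(text: str) -> Optional[AllocSite]:
    """Parse one line; return None for lines that do not match the layout."""
    m = _LINE_RE.match(text)
    if m is None:
        return None
    size, calls, path, lineno, module, func = m.groups()
    return AllocSite(int(size), int(calls), path, int(lineno), module, func)


def top_sites(lines: Iterable[str], n: int = 3) -> List[AllocSite]:
    """Return the n call sites holding the most bytes, largest first."""
    sites = [s for s in map(parse_allocinfo_line, lines) if s is not None]
    return sorted(sites, key=lambda s: s.size, reverse=True)[:n]


if __name__ == "__main__":
    # Reading the real file needs a kernel built with these patches.
    with open("/proc/allocinfo") as f:
        for site in top_sites(f, n=10):
            print(site.size, site.calls, f"{site.file}:{site.line}", site.func)
```

Lines that do not fit the assumed layout are skipped rather than treated as errors, so the sketch tolerates any extra lines the file may contain.
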
> Since v5 [2]:
> - Added Reviewed-by and Acked-by, per Vlastimil Babka and Miguel Ojeda
> - Changed pgalloc_tag_{add|sub} to use number of pages instead of order, per Matthew Wilcox
> - Changed pgalloc_tag_sub_bytes to pgalloc_tag_sub_pages and adjusted the usage, per Matthew Wilcox
> - Moved static key check before prepare_slab_obj_exts_hook(), per Vlastimil Babka
> - Fixed RUST helper, per Miguel Ojeda
> - Fixed documentation, per Randy Dunlap
> - Rebased over mm-unstable
> 
> Usage:
> kconfig options:
>   - CONFIG_MEM_ALLOC_PROFILING
>   - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
>   - CONFIG_MEM_ALLOC_PROFILING_DEBUG
>     adds warnings for allocations that weren't accounted because of a
>     missing annotation
> 
> sysctl:
>    /proc/sys/vm/mem_profiling
> 
> Runtime info:
>    /proc/allocinfo
> 
> Notes:
> 
> [1]: Overhead
> To measure the overhead we are comparing the following configurations:
> (1) Baseline with CONFIG_MEMCG_KMEM=n
> (2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
>      CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n)
> (3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
>      CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y)
> (4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &&
>      CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n &&
>      /proc/sys/vm/mem_profiling=1)
> (5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
> (6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
>      CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y
> (7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &&
>      CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y
> 
> Performance overhead:
> To evaluate performance we implemented an in-kernel test executing
> multiple get_free_page/free_page and kmalloc/kfree calls with allocation
> sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
> affinity set to a specific CPU to minimize the noise. Below are results
> from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
> 56 core Intel Xeon:
> 
>                          kmalloc                 pgalloc
> (1 baseline)            6.764s                  16.902s
> (2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
> (3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
> (4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
> (5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
> (6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
> (7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)
> 
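
The percentages in the table above are all relative to configuration (1). As a rough sketch (the timings are copied from the table; the helper name overhead_pct is hypothetical), the same arithmetic can also be used to compare against a different baseline, for example the cost of enabling profiling on top of memcg, (7) against (5):

```python
# Timings in seconds, copied from the quoted table. The keys are the
# configuration numbers (1)-(7) used in the cover letter.
KMALLOC = {1: 6.764, 2: 6.793, 3: 7.197, 4: 7.405,
           5: 13.388, 6: 13.332, 7: 13.446}
PGALLOC = {1: 16.902, 2: 17.007, 3: 23.666, 4: 23.901,
           5: 48.460, 6: 48.105, 7: 54.963}


def overhead_pct(baseline: float, measured: float) -> float:
    """Relative slowdown of `measured` against `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Same comparison as the table: everything against configuration (1).
    for cfg in range(2, 8):
        print(f"({cfg}) vs (1): "
              f"kmalloc {overhead_pct(KMALLOC[1], KMALLOC[cfg]):+.2f}%  "
              f"pgalloc {overhead_pct(PGALLOC[1], PGALLOC[cfg]):+.2f}%")

    # Profiling enabled on top of memcg: configuration (7) against (5).
    print(f"(7) vs (5): "
          f"kmalloc {overhead_pct(KMALLOC[5], KMALLOC[7]):+.2f}%  "
          f"pgalloc {overhead_pct(PGALLOC[5], PGALLOC[7]):+.2f}%")
```

This only rearranges the numbers already posted; it does not add any new measurements.
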
> Memory overhead:
> Kernel size:
> 
>      text        data        bss         dec         diff
> (1)  26515311    18890222    17018880    62424413
> (2)  26524728    19423818    16740352    62688898    264485
> (3)  26524724    19423818    16740352    62688894    264481
> (4)  26524728    19423818    16740352    62688898    264485
> (5)  26541782    18964374    16957440    62463596    39183
> 
> Memory consumption on a 56 core Intel CPU with 125GB of memory:
> Code tags:           192 kB
> PageExts:         262144 kB (256MB)
> SlabExts:           9876 kB (9.6MB)
> PcpuExts:            512 kB (0.5MB)
> 
> Total overhead is 0.2% of total memory.
> 
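
The 0.2% figure can be checked from the four components listed above. A small sketch, assuming "125GB" means 125 GiB (the mail does not say which unit is meant) and using hypothetical helper names:

```python
# Per-component overhead in kB, copied from the quoted list.
OVERHEAD_KB = {
    "Code tags": 192,
    "PageExts": 262144,
    "SlabExts": 9876,
    "PcpuExts": 512,
}

# Assumption: "125GB of memory" is read as 125 GiB, expressed in kB.
TOTAL_RAM_KB = 125 * 1024 * 1024


def total_overhead_kb(parts: dict) -> int:
    """Sum the per-component overhead in kB."""
    return sum(parts.values())


def overhead_pct_of_ram(parts: dict, ram_kb: int) -> float:
    """Overhead as a percentage of total memory."""
    return total_overhead_kb(parts) / ram_kb * 100.0


if __name__ == "__main__":
    total = total_overhead_kb(OVERHEAD_KB)
    print(f"total: {total} kB ({total / 1024:.1f} MB)")
    print(f"share of RAM: {overhead_pct_of_ram(OVERHEAD_KB, TOTAL_RAM_KB):.2f}%")
```

PageExts dominates the total, so the share of RAM is close to the same value regardless of the other three components.
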
> Benchmarks:
> 
> Hackbench tests run 100 times:
> hackbench -s 512 -l 200 -g 15 -f 25 -P
>        baseline       disabled profiling           enabled profiling
> avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
> stdev 0.0137         0.0188                       0.0077
> 
> 
> hackbench -l 10000
>        baseline       disabled profiling           enabled profiling
> avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
> stdev 0.0933         0.0286                       0.0489
> 
> stress-ng tests:
> stress-ng --class memory --seq 4 -t 60
> stress-ng --class cpu --seq 4 -t 60
> Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/
> 
> [2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/

If I enable this, I consistently get percpu allocation failures. I can
occasionally reproduce it in qemu. I've attached the logs and my config;
please let me know if there's anything else that could be relevant.

Kind regards,
Klara Modin
Download attachment "debug_alloc_profiling.log.gz" of type "application/gzip" (28378 bytes)

Download attachment "config.gz" of type "application/gzip" (38465 bytes)

Download attachment "qemu-alloc3.log.gz" of type "application/gzip" (14651 bytes)
