lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+CK2bBLD-mZ4ne56Awxbiy0EGpJq69k5qUKZwcXVB1Rt581TQ@mail.gmail.com>
Date: Mon, 12 Feb 2024 19:14:20 -0500
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, kent.overstreet@...ux.dev, mhocko@...e.com, 
	vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de, 
	dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com, 
	corbet@....net, void@...ifault.com, peterz@...radead.org, 
	juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org, 
	arnd@...db.de, tglx@...utronix.de, mingo@...hat.com, 
	dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com, 
	david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org, 
	nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev, 
	rppt@...nel.org, paulmck@...nel.org, yosryahmed@...gle.com, yuzhao@...gle.com, 
	dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com, 
	keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com, 
	gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com, 
	vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org, 
	bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com, 
	penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com, 
	glider@...gle.com, elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com, 
	songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com, 
	minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com, 
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
	iommu@...ts.linux.dev, linux-arch@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, 
	linux-modules@...r.kernel.org, kasan-dev@...glegroups.com, 
	cgroups@...r.kernel.org
Subject: Re: [PATCH v3 00/35] Memory allocation profiling

On Mon, Feb 12, 2024 at 4:39 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> Memory allocation, v3 and final:
>
> Overview:
> Low overhead [1] per-callsite memory allocation profiling. Not just for debug
> kernels, overhead low enough to be deployed in production.
>
> We're aiming to get this in the next merge window, for 6.9. The feedback
> we've gotten has been that even out of tree this patchset has already
> been useful, and there's a significant amount of other work gated on the
> code tagging functionality included in this patchset [2].
>
> Example output:
>   root@...ia-kvm:~# sort -h /proc/allocinfo|tail
>    3.11MiB     2850 fs/ext4/super.c:1408 module:ext4 func:ext4_alloc_inode
>    3.52MiB      225 kernel/fork.c:356 module:fork func:alloc_thread_stack_node
>    3.75MiB      960 mm/page_ext.c:270 module:page_ext func:alloc_page_ext
>    4.00MiB        2 mm/khugepaged.c:893 module:khugepaged func:hpage_collapse_alloc_folio
>    10.5MiB      168 block/blk-mq.c:3421 module:blk_mq func:blk_mq_alloc_rqs
>    14.0MiB     3594 include/linux/gfp.h:295 module:filemap func:folio_alloc_noprof
>    26.8MiB     6856 include/linux/gfp.h:295 module:memory func:folio_alloc_noprof
>    64.5MiB    98315 fs/xfs/xfs_rmap_item.c:147 module:xfs func:xfs_rui_init
>    98.7MiB    25264 include/linux/gfp.h:295 module:readahead func:folio_alloc_noprof
>     125MiB     7357 mm/slub.c:2201 module:slub func:alloc_slab_page

This kind of memory profiling would be an incredible asset in cloud
environments.

Over the past year, we've encountered several kernel memory overhead
issues. Two particularly severe cases involved excessively large IOMMU
page tables (20GB per machine) and IOVA magazines (up to 8GB).
Considering thousands of machines were affected, the cumulative memory
waste was huge.

While we eventually resolved these issues with custom kernel profiling
hacks (some based on this series) and kdump analysis, comprehensive
memory profiling would have significantly accelerated the diagnostic
process, pinpointing the precise source of the allocations.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ