Message-ID: <CA+CK2bBLD-mZ4ne56Awxbiy0EGpJq69k5qUKZwcXVB1Rt581TQ@mail.gmail.com>
Date: Mon, 12 Feb 2024 19:14:20 -0500
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, kent.overstreet@...ux.dev, mhocko@...e.com,
vbabka@...e.cz, hannes@...xchg.org, roman.gushchin@...ux.dev, mgorman@...e.de,
dave@...olabs.net, willy@...radead.org, liam.howlett@...cle.com,
corbet@....net, void@...ifault.com, peterz@...radead.org,
juri.lelli@...hat.com, catalin.marinas@....com, will@...nel.org,
arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
nathan@...nel.org, dennis@...nel.org, tj@...nel.org, muchun.song@...ux.dev,
rppt@...nel.org, paulmck@...nel.org, yosryahmed@...gle.com, yuzhao@...gle.com,
dhowells@...hat.com, hughd@...gle.com, andreyknvl@...il.com,
keescook@...omium.org, ndesaulniers@...gle.com, vvvvvv@...gle.com,
gregkh@...uxfoundation.org, ebiggers@...gle.com, ytcoode@...il.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
glider@...gle.com, elver@...gle.com, dvyukov@...gle.com, shakeelb@...gle.com,
songmuchun@...edance.com, jbaron@...mai.com, rientjes@...gle.com,
minchan@...gle.com, kaleshsingh@...gle.com, kernel-team@...roid.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, linux-arch@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
linux-modules@...r.kernel.org, kasan-dev@...glegroups.com,
cgroups@...r.kernel.org
Subject: Re: [PATCH v3 00/35] Memory allocation profiling
On Mon, Feb 12, 2024 at 4:39 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> Memory allocation profiling, v3 and final:
>
> Overview:
> Low-overhead [1] per-callsite memory allocation profiling. Not just for
> debug kernels: the overhead is low enough to be deployed in production.
>
> We're aiming to get this in the next merge window, for 6.9. The feedback
> we've gotten has been that even out of tree this patchset has already
> been useful, and there's a significant amount of other work gated on the
> code tagging functionality included in this patchset [2].
>
> Example output:
> root@...ia-kvm:~# sort -h /proc/allocinfo|tail
> 3.11MiB   2850 fs/ext4/super.c:1408 module:ext4 func:ext4_alloc_inode
> 3.52MiB    225 kernel/fork.c:356 module:fork func:alloc_thread_stack_node
> 3.75MiB    960 mm/page_ext.c:270 module:page_ext func:alloc_page_ext
> 4.00MiB      2 mm/khugepaged.c:893 module:khugepaged func:hpage_collapse_alloc_folio
> 10.5MiB    168 block/blk-mq.c:3421 module:blk_mq func:blk_mq_alloc_rqs
> 14.0MiB   3594 include/linux/gfp.h:295 module:filemap func:folio_alloc_noprof
> 26.8MiB   6856 include/linux/gfp.h:295 module:memory func:folio_alloc_noprof
> 64.5MiB  98315 fs/xfs/xfs_rmap_item.c:147 module:xfs func:xfs_rui_init
> 98.7MiB  25264 include/linux/gfp.h:295 module:readahead func:folio_alloc_noprof
>  125MiB   7357 mm/slub.c:2201 module:slub func:alloc_slab_page
This kind of memory profiling would be an incredible asset in cloud
environments.
Over the past year, we've encountered several kernel memory overhead
issues. Two particularly severe cases involved excessively large IOMMU
page tables (20GB per machine) and IOVA magazines (up to 8GB).
Considering that thousands of machines were affected, the cumulative
memory waste ran into the tens of terabytes: at 20GB each, just 1,000
machines already account for ~20TB of RAM.
While we eventually resolved these issues with custom kernel profiling
hacks (some based on this series) and kdump analysis, comprehensive
memory profiling would have significantly accelerated the diagnosis by
pinpointing the precise source of the allocations.
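
For instance, assuming the offending callsites live under
drivers/iommu/ (the IOVA magazines, for example, are allocated in
drivers/iommu/iova.c) and are annotated by this series, a one-liner
against /proc/allocinfo would have surfaced them directly; the grep
pattern here is illustrative:

# grep 'drivers/iommu' /proc/allocinfo | sort -h | tail

That immediately gives the bytes, call counts, and exact file:line of
the top IOMMU allocators, without needing a crash dump.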