Message-ID: <20251209090831.13c7a639@xps15mal>
Date: Tue, 9 Dec 2025 09:08:31 +1000
From: Mal Haak <malcolm@...k.id.au>
To: linux-kernel@...r.kernel.org, surenb@...gle.com, David Wang <00107082@....com>
Subject: Re: Possible memory leak in 6.17.7
On Mon, 8 Dec 2025 19:08:29 +0800
David Wang <00107082@....com> wrote:
> On Mon, 10 Nov 2025 18:20:08 +1000
> Mal Haak <malcolm@...k.id.au> wrote:
> > Hello,
> >
> > I have found a memory leak in 6.17.7 but I am unsure how to track it
> > down effectively.
> >
> >
>
> I think the `memory allocation profiling` feature can help.
> https://docs.kernel.org/mm/allocation-profiling.html
>
> You would need to build a kernel with
> CONFIG_MEM_ALLOC_PROFILING=y
> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
>
> And check /proc/allocinfo for the suspicious allocations which take
> more memory than expected.
>
> (I once caught an nvidia driver memory leak.)
>
>
> FYI
> David
>
Thank you for your suggestion. I have some results.
I ran the rsync workload for about 9 hours, and by then the leak looked
like it was happening again.
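(For anyone following along: going by the allocation-profiling doc David
linked, the feature also has a runtime knob, so a quick sanity check
before trusting the numbers looks roughly like this; sysctl path per the
doc, output illustrative:

# grep MEM_ALLOC_PROFILING /boot/config-$(uname -r)
CONFIG_MEM_ALLOC_PROFILING=y
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
# cat /proc/sys/vm/mem_profiling
1

If it reads 0 you can echo 1 into it, though anything allocated while it
was off obviously won't be accounted.)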
# smem -pw
Area                           Used      Cache   Noncache
firmware/hardware             0.00%      0.00%      0.00%
kernel image                  0.00%      0.00%      0.00%
kernel dynamic memory        80.46%     65.80%     14.66%
userspace memory              0.35%      0.16%      0.19%
free memory                  19.19%     19.19%      0.00%
# sort -g /proc/allocinfo|tail|numfmt --to=iec
22M 5609 mm/memory.c:1190 func:folio_prealloc
23M 1932 fs/xfs/xfs_buf.c:226 [xfs] func:xfs_buf_alloc_backing_mem
24M 24135 fs/xfs/xfs_icache.c:97 [xfs] func:xfs_inode_alloc
27M 6693 mm/memory.c:1192 func:folio_prealloc
58M 14784 mm/page_ext.c:271 func:alloc_page_ext
258M 129 mm/khugepaged.c:1069 func:alloc_charge_folio
430M 770788 lib/xarray.c:378 func:xas_alloc
545M 36444 mm/slub.c:3059 func:alloc_slab_page
9.8G 2563617 mm/readahead.c:189 func:ractl_alloc_folio
20G 5164004 mm/filemap.c:2012 func:__filemap_get_folio
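(As an aside, since absolute sizes alone don't show growth, a rough way
to find the sites that are actually growing is to diff two snapshots. A
sketch, assuming the raw "<bytes> <calls> <site>" lines in
/proc/allocinfo and keying on the file:line column:

# awk '$1 ~ /^[0-9]+$/ {print $3, $1}' /proc/allocinfo | sort > /tmp/alloc.before
  ... let the workload run for a while ...
# awk '$1 ~ /^[0-9]+$/ {print $3, $1}' /proc/allocinfo | sort > /tmp/alloc.after
# join /tmp/alloc.before /tmp/alloc.after | \
    awk '$3 > $2 {print $3 - $2, $1}' | sort -gr | head | numfmt --to=iec

That prints the call sites whose byte totals grew the most between the
two snapshots; sites present in only one snapshot get dropped by join,
which is fine when you're looking for steady growth.)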
So I stopped the workload and dropped caches (echo 3, i.e. page cache
plus reclaimable slab) to confirm.
# echo 3 > /proc/sys/vm/drop_caches
# smem -pw
Area                           Used      Cache   Noncache
firmware/hardware             0.00%      0.00%      0.00%
kernel image                  0.00%      0.00%      0.00%
kernel dynamic memory        33.45%      0.09%     33.36%
userspace memory              0.36%      0.16%      0.19%
free memory                  66.20%     66.20%      0.00%
# sort -g /proc/allocinfo|tail|numfmt --to=iec
12M 2987 mm/execmem.c:41 func:execmem_vmalloc
12M 3 kernel/dma/pool.c:96 func:atomic_pool_expand
13M 751 mm/slub.c:3061 func:alloc_slab_page
16M 8 mm/khugepaged.c:1069 func:alloc_charge_folio
18M 4355 mm/memory.c:1190 func:folio_prealloc
24M 6119 mm/memory.c:1192 func:folio_prealloc
58M 14784 mm/page_ext.c:271 func:alloc_page_ext
61M 15448 mm/readahead.c:189 func:ractl_alloc_folio
79M 6726 mm/slub.c:3059 func:alloc_slab_page
11G 2674488 mm/filemap.c:2012 func:__filemap_get_folio
So if I'm reading this correctly, something is causing folios to
accumulate without ever becoming freeable?
It's also clear that some of the folios are counted as cache and some
aren't.
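(To cross-check where that leftover memory lands in the kernel's own
accounting after the drop, this is a quick look:

# grep -E 'Cached|Buffers|Slab|SUnreclaim|Unevictable' /proc/meminfo

If the bytes held at mm/filemap.c:2012 aren't showing under Cached,
that would match smem calling them noncache.)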
Like I said, 6.17 and 6.18 both have the issue; 6.12 does not. I'm now
going to manually walk back through the releases in between to find
where it first appears, purely because Rust and Python toolchain
incompatibilities are making it hard to build older kernels, which
makes a git bisect a bit of an adventure.
I'll test distribution packages until I get closer, then sort out the
build issues.
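For reference, once I can build, the bisect itself is the usual routine
between the last good and first bad releases, roughly:

# git bisect start
# git bisect bad v6.17
# git bisect good v6.12
  ... build, boot, run the rsync workload, watch /proc/allocinfo ...
# git bisect good      <- or "git bisect bad"; repeat until it converges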
Thanks,
Mal