Message-ID: <20251210234318.5d8c2d68@xps15mal>
Date: Wed, 10 Dec 2025 23:43:18 +1000
From: Mal Haak <malcolm@...k.id.au>
To: "David Wang" <00107082@....com>
Cc: linux-kernel@...r.kernel.org, surenb@...gle.com, xiubli@...hat.com,
idryomov@...il.com, ceph-devel@...r.kernel.org
Subject: Re: Possible memory leak in 6.17.7
On Tue, 9 Dec 2025 12:40:21 +0800 (CST)
"David Wang" <00107082@....com> wrote:
> At 2025-12-09 07:08:31, "Mal Haak" <malcolm@...k.id.au> wrote:
> >On Mon, 8 Dec 2025 19:08:29 +0800
> >David Wang <00107082@....com> wrote:
> >
> >> On Mon, 10 Nov 2025 18:20:08 +1000
> >> Mal Haak <malcolm@...k.id.au> wrote:
> >> > Hello,
> >> >
> >> > I have found a memory leak in 6.17.7 but I am unsure how to
> >> > track it down effectively.
> >> >
> >> >
> >>
> >> I think the `memory allocation profiling` feature can help.
> >> https://docs.kernel.org/mm/allocation-profiling.html
> >>
> >> You would need to build a kernel with
> >> CONFIG_MEM_ALLOC_PROFILING=y
> >> CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
> >>
> >> And check /proc/allocinfo for the suspicious allocations which take
> >> more memory than expected.
> >>
> >> (I once caught a nvidia driver memory leak.)
> >>
> >>
> >> FYI
> >> David
> >>
> >
> >Thank you for your suggestion. I have some results.
> >
> >Ran the rsync workload for about 9 hours. It started to look like it
> >was happening.
> ># smem -pw
> >Area Used Cache Noncache
> >firmware/hardware 0.00% 0.00% 0.00%
> >kernel image 0.00% 0.00% 0.00%
> >kernel dynamic memory 80.46% 65.80% 14.66%
> >userspace memory 0.35% 0.16% 0.19%
> >free memory 19.19% 19.19% 0.00%
> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
> > 22M 5609 mm/memory.c:1190 func:folio_prealloc
> > 23M 1932 fs/xfs/xfs_buf.c:226 [xfs] func:xfs_buf_alloc_backing_mem
> > 24M 24135 fs/xfs/xfs_icache.c:97 [xfs] func:xfs_inode_alloc
> > 27M 6693 mm/memory.c:1192 func:folio_prealloc
> > 58M 14784 mm/page_ext.c:271 func:alloc_page_ext
> > 258M 129 mm/khugepaged.c:1069 func:alloc_charge_folio
> > 430M 770788 lib/xarray.c:378 func:xas_alloc
> > 545M 36444 mm/slub.c:3059 func:alloc_slab_page
> > 9.8G 2563617 mm/readahead.c:189 func:ractl_alloc_folio
> > 20G 5164004 mm/filemap.c:2012 func:__filemap_get_folio
> >
> >
> >So I stopped the workload and dropped caches to confirm.
> >
> ># echo 3 > /proc/sys/vm/drop_caches
> ># smem -pw
> >Area Used Cache Noncache
> >firmware/hardware 0.00% 0.00% 0.00%
> >kernel image 0.00% 0.00% 0.00%
> >kernel dynamic memory 33.45% 0.09% 33.36%
> >userspace memory 0.36% 0.16% 0.19%
> >free memory 66.20% 66.20% 0.00%
> ># sort -g /proc/allocinfo|tail|numfmt --to=iec
> > 12M 2987 mm/execmem.c:41 func:execmem_vmalloc
> > 12M 3 kernel/dma/pool.c:96 func:atomic_pool_expand
> > 13M 751 mm/slub.c:3061 func:alloc_slab_page
> > 16M 8 mm/khugepaged.c:1069 func:alloc_charge_folio
> > 18M 4355 mm/memory.c:1190 func:folio_prealloc
> > 24M 6119 mm/memory.c:1192 func:folio_prealloc
> > 58M 14784 mm/page_ext.c:271 func:alloc_page_ext
> > 61M 15448 mm/readahead.c:189 func:ractl_alloc_folio
> > 79M 6726 mm/slub.c:3059 func:alloc_slab_page
> > 11G 2674488 mm/filemap.c:2012 func:__filemap_get_folio
> >
> >So if I'm reading this correctly, something is causing folios to
> >collect and not be freed?
>
> CC cephfs, maybe someone there can easily make sense of this
> folio usage
>
>
> >
> >Also it's clear that some of the folios are counting as cache and
> >some aren't.
> >
> >Like I said 6.17 and 6.18 both have the issue. 6.12 does not. I'm now
> >going to manually walk through previous kernel releases and find
> >where it first starts happening purely because I'm having issues
> >building earlier kernels due to rust stuff and other python
> >incompatibilities making doing a git-bisect a bit fun.
> >
> >I'll do it the packages way until I get closer, then solve the build
> >issues.
> >
> >Thanks,
> >Mal
> >
Thanks David.
I've contacted the ceph developers as well.
There was a suggestion that it was due to the change from, to quote:
"folio.free() to folio.put() or something like this."
The change happened around 6.14/6.15.
I've found an easier reproducer.
There has been a suggestion that perhaps the ceph team might not fix
this, as "you can just reboot before the machine becomes unstable" and
"Nobody else has encountered this bug".
I'll leave that to other people to make a call on, but I'd assume the
lack of reports is because most stable distros are still on a far too
early kernel and/or are using the fuse driver with k8s.
Anyway, thanks for your assistance.
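
For anyone repeating the before/after comparison quoted above, here is a
minimal Python sketch of checking which allocation tags survive
drop_caches. It assumes the "size calls tag" column layout shown in the
quoted /proc/allocinfo output and accepts either raw byte counts or the
suffixes produced by numfmt --to=iec. The names parse_size,
parse_allocinfo and retained are made up for illustration, and the sample
data is just the two largest entries from the listings above.

```python
import re

# Multipliers for the IEC suffixes that `numfmt --to=iec` emits.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert '20G', '9.8G' or a plain byte count into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def parse_allocinfo(text):
    """Map (source location, func tag) -> (bytes, calls) for one snapshot."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        # Data lines are: size, calls, file:line, optional [module], func:name.
        # Header and blank lines do not have a numeric second column.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        key = (fields[2], fields[-1])
        entries[key] = (parse_size(fields[0]), int(fields[1]))
    return entries


def retained(before, after):
    """List tags still holding memory in `after`, largest first.

    Each row is (key, bytes_before, bytes_after, fraction_retained);
    the fraction is None when the tag was absent from `before`.
    """
    rows = []
    for key, (after_bytes, _calls) in after.items():
        before_bytes = before.get(key, (0, 0))[0]
        fraction = after_bytes / before_bytes if before_bytes else None
        rows.append((key, before_bytes, after_bytes, fraction))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Two largest entries from the listings quoted above:
# BEFORE is during the workload, AFTER is following drop_caches.
BEFORE = """\
 9.8G 2563617 mm/readahead.c:189 func:ractl_alloc_folio
 20G 5164004 mm/filemap.c:2012 func:__filemap_get_folio
"""
AFTER = """\
 61M 15448 mm/readahead.c:189 func:ractl_alloc_folio
 11G 2674488 mm/filemap.c:2012 func:__filemap_get_folio
"""

if __name__ == "__main__":
    for key, was, now, fraction in retained(parse_allocinfo(BEFORE),
                                            parse_allocinfo(AFTER)):
        share = "n/a" if fraction is None else f"{fraction:.0%}"
        print(f"{key[0]} {key[1]}: {was} -> {now} bytes ({share} retained)")
```

In practice the two snapshots would be saved copies of /proc/allocinfo
taken before and after echo 3 > /proc/sys/vm/drop_caches. A tag that keeps
a large share of its memory after the caches are dropped, as
__filemap_get_folio does here, is the one worth chasing.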