Message-ID: <8c8e8dc4d30a8ca37a57d7f29c5f29cdf7a904ee.camel@ibm.com>
Date: Mon, 15 Dec 2025 19:42:56 +0000
From: Viacheslav Dubeyko <Slava.Dubeyko@....com>
To: "malcolm@...k.id.au" <malcolm@...k.id.au>,
"00107082@....com"
<00107082@....com>
CC: "ceph-devel@...r.kernel.org" <ceph-devel@...r.kernel.org>,
Xiubo Li
<xiubli@...hat.com>,
"idryomov@...il.com" <idryomov@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"surenb@...gle.com" <surenb@...gle.com>
Subject: Re: Possible memory leak in 6.17.7

Hi Mal,
On Thu, 2025-12-11 at 14:23 +1000, Mal Haak wrote:
> On Thu, 11 Dec 2025 11:28:21 +0800 (CST)
> "David Wang" <00107082@....com> wrote:
>
> > At 2025-12-10 21:43:18, "Mal Haak" <malcolm@...k.id.au> wrote:
> > > On Tue, 9 Dec 2025 12:40:21 +0800 (CST)
> > > "David Wang" <00107082@....com> wrote:
> > >
> > > > At 2025-12-09 07:08:31, "Mal Haak" <malcolm@...k.id.au> wrote:
> > > > > On Mon, 8 Dec 2025 19:08:29 +0800
> > > > > David Wang <00107082@....com> wrote:
> > > > >
> > > > > > On Mon, 10 Nov 2025 18:20:08 +1000
> > > > > > Mal Haak <malcolm@...k.id.au> wrote:
> > > > > > > Hello,
> > > > > > >
> > > > > > > I have found a memory leak in 6.17.7 but I am unsure how to
> > > > > > > track it down effectively.
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > I think the `memory allocation profiling` feature can help.
> > > > > > https://docs.kernel.org/mm/allocation-profiling.html
> > > > > >
> > > > > > You would need to build a kernel with
> > > > > > CONFIG_MEM_ALLOC_PROFILING=y
> > > > > > CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y
> > > > > >
> > > > > > And check /proc/allocinfo for the suspicious allocations which
> > > > > > take more memory than expected.
> > > > > >
> > > > > > (I once caught a nvidia driver memory leak.)
> > > > > >
> > > > > >
> > > > > > FYI
> > > > > > David
> > > > > >
> > > > >
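Side note, mostly for my own reproduction attempt: if I read
Documentation/mm/allocation-profiling.rst correctly, the profiling can also be
toggled at runtime once CONFIG_MEM_ALLOC_PROFILING=y is built in. A rough
sketch of what I plan to run (the vm.mem_profiling sysctl name is taken from
those docs, please correct me if it differs on 6.17.x):

    # enable the runtime switch if it was not enabled by default
    sysctl -w vm.mem_profiling=1
    # after the workload has run for a while, list the biggest consumers
    sort -g /proc/allocinfo | tail | numfmt --to=iec
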
> > > > > Thank you for your suggestion. I have some results.
> > > > >
> > > > > Ran the rsync workload for about 9 hours. It started to look like
> > > > > the leak was happening.
> > > > > # smem -pw
> > > > > Area Used Cache Noncache
> > > > > firmware/hardware 0.00% 0.00% 0.00%
> > > > > kernel image 0.00% 0.00% 0.00%
> > > > > kernel dynamic memory 80.46% 65.80% 14.66%
> > > > > userspace memory 0.35% 0.16% 0.19%
> > > > > free memory 19.19% 19.19% 0.00%
> > > > > # sort -g /proc/allocinfo|tail|numfmt --to=iec
> > > > >  22M     5609 mm/memory.c:1190 func:folio_prealloc
> > > > >  23M     1932 fs/xfs/xfs_buf.c:226 [xfs] func:xfs_buf_alloc_backing_mem
> > > > >  24M    24135 fs/xfs/xfs_icache.c:97 [xfs] func:xfs_inode_alloc
> > > > >  27M     6693 mm/memory.c:1192 func:folio_prealloc
> > > > >  58M    14784 mm/page_ext.c:271 func:alloc_page_ext
> > > > > 258M      129 mm/khugepaged.c:1069 func:alloc_charge_folio
> > > > > 430M   770788 lib/xarray.c:378 func:xas_alloc
> > > > > 545M    36444 mm/slub.c:3059 func:alloc_slab_page
> > > > > 9.8G  2563617 mm/readahead.c:189 func:ractl_alloc_folio
> > > > >  20G  5164004 mm/filemap.c:2012 func:__filemap_get_folio
> > > > >
> > > > >
> > > > > So I stopped the workload and dropped caches to confirm.
> > > > >
> > > > > # echo 3 > /proc/sys/vm/drop_caches
> > > > > # smem -pw
> > > > > Area Used Cache Noncache
> > > > > firmware/hardware 0.00% 0.00% 0.00%
> > > > > kernel image 0.00% 0.00% 0.00%
> > > > > kernel dynamic memory 33.45% 0.09% 33.36%
> > > > > userspace memory 0.36% 0.16% 0.19%
> > > > > free memory 66.20% 66.20% 0.00%
> > > > > # sort -g /proc/allocinfo|tail|numfmt --to=iec
> > > > >  12M     2987 mm/execmem.c:41 func:execmem_vmalloc
> > > > >  12M        3 kernel/dma/pool.c:96 func:atomic_pool_expand
> > > > >  13M      751 mm/slub.c:3061 func:alloc_slab_page
> > > > >  16M        8 mm/khugepaged.c:1069 func:alloc_charge_folio
> > > > >  18M     4355 mm/memory.c:1190 func:folio_prealloc
> > > > >  24M     6119 mm/memory.c:1192 func:folio_prealloc
> > > > >  58M    14784 mm/page_ext.c:271 func:alloc_page_ext
> > > > >  61M    15448 mm/readahead.c:189 func:ractl_alloc_folio
> > > > >  79M     6726 mm/slub.c:3059 func:alloc_slab_page
> > > > >  11G  2674488 mm/filemap.c:2012 func:__filemap_get_folio
> >
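One small thing that might sharpen the picture here (just a sketch, the file
names are arbitrary): snapshot /proc/allocinfo immediately before and after
dropping caches and diff the two, so we can see exactly which tags keep their
size while everything else shrinks:

    # snapshot before dropping caches (the temporary file names are only examples)
    sort -g /proc/allocinfo > /tmp/allocinfo.before
    echo 3 > /proc/sys/vm/drop_caches
    # snapshot again and compare the two
    sort -g /proc/allocinfo > /tmp/allocinfo.after
    diff -u /tmp/allocinfo.before /tmp/allocinfo.after | less
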
> > Maybe narrowing down the "Noncache" callers of __filemap_get_folio would
> > help clarify things. (It could be designed that way and simply need a
> > route other than dropping caches to release the memory; just a guess...)
> > If you want, you could modify the code to split the accounting for
> > __filemap_get_folio according to its callers.
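
Splitting the accounting per caller, as David suggests, would be the most
precise option. As a quicker first pass that needs no rebuild, one could also
sample the call chains into __filemap_get_folio with perf while the rsync
workload runs; something along these lines (only a sketch, the duration is
arbitrary):

    # put a kprobe on the function and record call chains for a minute
    perf probe --add __filemap_get_folio
    perf record -e probe:__filemap_get_folio -a -g -- sleep 60
    # the report should show which callers dominate
    perf report --stdio
    perf probe --del __filemap_get_folio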
>
>
> Thanks again, I'll add this patch in and see where I end up.
>
> The issue is that nothing will cause the memory to be freed. Dropping
> caches doesn't work, memory pressure doesn't work, unmounting the
> filesystems doesn't work, and removing the cephfs and netfs kernel modules
> doesn't work either.
>
> This is why I feel it's a ref_count (or similar) issue.
>
> I've also found that it seems to be a fixed amount leaked each time, per
> file. Simply doing lots of IO on one large file doesn't leak as fast as
> doing it on lots of "small" files (greater than 10MB but less than 100MB
> seems to be the sweet spot).
>
> Also, dropping caches while the workload is running actually amplifies
> the issue. So it very much feels like something is wrong in the reclaim
> code.
>
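If reclaim really is scanning these folios but failing to free them, that
should be visible in /proc/vmstat as well; watching the file-page and reclaim
counters while you reproduce might help confirm it (a sketch, the interval is
arbitrary):

    # watch file-page and reclaim counters while the leak is being reproduced
    watch -n 5 "grep -E 'nr_file_pages|pgscan|pgsteal|workingset' /proc/vmstat"
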
> Anyway I'll get this patch applied and see where I end up.
>
> I now have crash dumps (after enabling crash_on_oom), so I'm going to try
> to find these structures and see what state they are in.
>
>
Thanks a lot for reporting the issue. Finally, I can see the discussion on the
mailing list. :) Are you working on a patch with the fix? Should we wait for
your fix, or should I start reproducing and investigating the issue myself? I
am simply trying to avoid colliding patches, and I also have several other
CephFS kernel client issues queued up for fixing. :)

Thanks,
Slava.