Message-ID: <20251208110829.11840-1-00107082@163.com>
Date: Mon, 8 Dec 2025 19:08:29 +0800
From: David Wang <00107082@....com>
To: malcolm@...k.id.au
Cc: linux-kernel@...r.kernel.org,
surenb@...gle.com
Subject: Re: Possible memory leak in 6.17.7
On Mon, 10 Nov 2025 18:20:08 +1000
Mal Haak <malcolm@...k.id.au> wrote:
> Hello,
>
> I have found a memory leak in 6.17.7 but I am unsure how to track it
> down effectively.
>
> I am running a server that has a heavy read/write workload to a cephfs
> file system. It is a VM.
>
> Over time it appears that the non-cache usage of kernel dynamic
> memory increases. The kernel seems to think the pages are reclaimable
> however nothing appears to trigger the reclaim. This leads to
> workloads getting killed via oomkiller.
>
> smem -wp output:
>
> Area                           Used      Cache   Noncache
> firmware/hardware             0.00%      0.00%      0.00%
> kernel image                  0.00%      0.00%      0.00%
> kernel dynamic memory        88.21%     36.25%     51.96%
> userspace memory              9.49%      0.15%      9.34%
> free memory                   2.30%      2.30%      0.00%
>
> free -h output:
>
>                total        used        free      shared  buff/cache   available
> Mem:            31Gi       3.6Gi       500Mi       4.0Mi        11Gi        27Gi
> Swap:          4.0Gi       179Mi       3.8Gi
>
> Reverting to the previous LTS fixes the issue.
>
> smem -wp output:
> Area                           Used      Cache   Noncache
> firmware/hardware             0.00%      0.00%      0.00%
> kernel image                  0.00%      0.00%      0.00%
> kernel dynamic memory        80.22%     79.32%      0.90%
> userspace memory             10.48%      0.20%     10.28%
> free memory                   9.30%      9.30%      0.00%
>
I think the `memory allocation profiling` feature can help here:
https://docs.kernel.org/mm/allocation-profiling.html

You would need to build a kernel with

CONFIG_MEM_ALLOC_PROFILING=y
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y

and then check /proc/allocinfo for suspicious call sites, i.e. ones that
hold more memory than expected or keep growing while the workload runs.
(I once caught an nvidia driver memory leak this way.)
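
A rough sketch of how the /proc/allocinfo check could be scripted, in case
it is useful. It assumes the line format shown in the documentation linked
above (a byte count, a call count, then the tag with file:line and
function), and skips anything that does not start with two integers, such
as header lines. The names parse_allocinfo and top_allocations are made up
for this example, and reading the file may require root.

```python
import os

ALLOCINFO_PATH = "/proc/allocinfo"


def parse_allocinfo(text):
    """Return (bytes, calls, tag) tuples from /proc/allocinfo-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Data lines start with two integers; headers and comments do not.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        entries.append((int(fields[0]), int(fields[1]), fields[2].strip()))
    return entries


def top_allocations(text, limit=20):
    """Return the `limit` call sites currently holding the most bytes."""
    return sorted(parse_allocinfo(text), key=lambda e: e[0], reverse=True)[:limit]


if __name__ == "__main__" and os.path.exists(ALLOCINFO_PATH):
    try:
        with open(ALLOCINFO_PATH) as f:
            contents = f.read()
    except OSError as err:
        print(f"cannot read {ALLOCINFO_PATH}: {err}")
    else:
        for size, calls, tag in top_allocations(contents):
            print(f"{size / 2**20:10.1f} MiB {calls:10d} calls  {tag}")
```

Running this a few times while the workload is active and comparing the
top entries should show which call site keeps growing.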
FYI
David