linux-kernel - Re: Possible memory leak in 6.17.7


Message-ID: <20251208110829.11840-1-00107082@163.com>
Date: Mon,  8 Dec 2025 19:08:29 +0800
From: David Wang <00107082@....com>
To: malcolm@...k.id.au
Cc: linux-kernel@...r.kernel.org,
	surenb@...gle.com
Subject: Re: Possible memory leak in 6.17.7


On Mon, 10 Nov 2025 18:20:08 +1000
Mal Haak <malcolm@...k.id.au> wrote:
> Hello,
> 
> I have found a memory leak in 6.17.7 but I am unsure how to track it
> down effectively.
> 
> I am running a server that has a heavy read/write workload to a cephfs
> file system. It is a VM. 
> 
> Over time it appears that the non-cache usage of kernel dynamic
> memory increases. The kernel seems to think the pages are reclaimable
> however nothing appears to trigger the reclaim. This leads to
> workloads getting killed via oomkiller. 
> 
> smem -wp output:
> 
> Area                           Used      Cache   Noncache 
> firmware/hardware             0.00%      0.00%      0.00% 
> kernel image                  0.00%      0.00%      0.00% 
> kernel dynamic memory        88.21%     36.25%     51.96% 
> userspace memory              9.49%      0.15%      9.34% 
> free memory                   2.30%      2.30%      0.00% 
> 
> free -h output:
> 
>        total  used   free   shared  buff/cache available 
> Mem:   31Gi   3.6Gi  500Mi  4.0Mi   11Gi      27Gi 
> Swap:  4.0Gi  179Mi  3.8Gi
> 
> Reverting to the previous LTS fixes the issue
> 
> smem -wp output:
> Area                           Used      Cache   Noncache 
> firmware/hardware             0.00%      0.00%      0.00% 
> kernel image                  0.00%      0.00%      0.00% 
> kernel dynamic memory        80.22%     79.32%      0.90% 
> userspace memory             10.48%      0.20%     10.28% 
> free memory                   9.30%      9.30%      0.00% 
> 

I think the `memory allocation profiling` feature can help.
https://docs.kernel.org/mm/allocation-profiling.html

You would need to build a kernel with 
CONFIG_MEM_ALLOC_PROFILING=y
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y

Then check /proc/allocinfo for allocation sites that hold more memory
than expected.
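
A rough sketch of how the /proc/allocinfo check could be scripted, in case
it helps. It assumes the layout described in the documentation linked above
(header lines, then one line per allocation site with size in bytes, number
of calls, and a tag). The function names are made up for illustration and
are not part of any kernel tooling. Comparing two snapshots taken some time
apart under the workload is usually more telling than a single listing.

```python
"""Rank /proc/allocinfo entries by size and diff two snapshots (sketch)."""


def parse_allocinfo(text):
    """Return {tag: (bytes, calls)} parsed from /proc/allocinfo content."""
    entries = {}
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Skip header lines and anything not starting with two integers.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        size, calls, tag = int(fields[0]), int(fields[1]), fields[2].strip()
        # Sum duplicates in case the same tag text appears more than once.
        old_size, old_calls = entries.get(tag, (0, 0))
        entries[tag] = (old_size + size, old_calls + calls)
    return entries


def top_by_size(entries, n=20):
    """Return the n allocation sites currently holding the most bytes."""
    return sorted(entries.items(), key=lambda kv: kv[1][0], reverse=True)[:n]


def growth(before, after, n=20):
    """Return up to n (tag, byte_delta) pairs that grew between snapshots."""
    grown = []
    for tag, (size, _calls) in after.items():
        delta = size - before.get(tag, (0, 0))[0]
        if delta > 0:
            grown.append((tag, delta))
    return sorted(grown, key=lambda kv: kv[1], reverse=True)[:n]


def report(path="/proc/allocinfo", n=20):
    """Print the n largest allocation sites found in the given file."""
    with open(path) as f:
        entries = parse_allocinfo(f.read())
    for tag, (size, calls) in top_by_size(entries, n):
        print(f"{size / 2**20:10.1f} MiB {calls:10d} calls  {tag}")
```

Calling report() prints the largest sites; reading the file may require
root. To look for a leak, save the parse_allocinfo() result once, let the
workload run, parse again, and pass both results to growth(). Sites that
keep growing across several intervals are the ones worth a closer look.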

(I once caught an NVIDIA driver memory leak this way.)


FYI
David