Message-ID: <trinity-0ed602bd-15d4-4110-b3f4-668c2051904a-1711472684521@msvc-mesg-gmx122>
Date: Tue, 26 Mar 2024 18:04:44 +0100
From: Jan Schunk <scpcom@....de>
To: Benjamin Coddington <bcodding@...hat.com>
Cc: Chuck Lever III <chuck.lever@...cle.com>, Jeff Layton
<jlayton@...nel.org>, Neil Brown <neilb@...e.de>, Olga Kornievskaia
<kolga@...app.com>, Dai Ngo <dai.ngo@...cle.com>, Tom Talpey
<tom@...pey.com>, Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
linux-kernel@...r.kernel.org
Subject: Re: [External] : nfsd: memory leak when client does many file
operations
Before doing this on my own build, I tried it with the unmodified linux-image-6.6.13+bpo-amd64 kernel from Debian 12.
I installed systemtap, linux-headers-6.6.13+bpo-amd64 and linux-image-6.6.13+bpo-amd64-dbg and tried to run stap:
user@deb:~$ sudo stap -v --all-modules kmem_alloc.stp nfsd_file
WARNING: Kernel function symbol table missing [man warning::symbols]
Pass 1: parsed user script and 484 library scripts using 110120virt/96896res/7168shr/89800data kb, in 1360usr/1080sys/4963real ms.
WARNING: cannot find module kernel debuginfo: No DWARF information found [man warning::debuginfo]
semantic error: resolution failed in DWARF builder
semantic error: while resolving probe point: identifier 'kernel' at kmem_alloc.stp:5:7
source: probe kernel.function("kmem_cache_alloc") {
^
semantic error: no match
Pass 2: analyzed script: 1 probe, 5 functions, 1 embed, 3 globals using 112132virt/100352res/8704shr/91792data kb, in 30usr/30sys/167real ms.
Pass 2: analysis failed. [man error::pass2]
Tip: /usr/share/doc/systemtap/README.Debian should help you get started.
user@deb:~$
user@deb:~$ grep -E 'CONFIG_DEBUG_INFO|CONFIG_KPROBES|CONFIG_DEBUG_FS|CONFIG_RELAY' /boot/config-6.6.13+bpo-amd64
CONFIG_RELAY=y
CONFIG_KPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_NONE is not set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
# CONFIG_DEBUG_INFO_DWARF4 is not set
# CONFIG_DEBUG_INFO_DWARF5 is not set
# CONFIG_DEBUG_INFO_REDUCED is not set
CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
# CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_DEBUG_INFO_BTF=y
CONFIG_DEBUG_INFO_BTF_MODULES=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ALLOW_ALL=y
# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
# CONFIG_DEBUG_FS_ALLOW_NONE is not set
user@deb:~$
Do I need to enable other options?
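The grep above can also be done programmatically when checking several machines. This is a minimal sketch; the option list mirrors the four options grepped for above and is an assumption about what systemtap's DWARF probing typically needs, not an authoritative requirement list:

```python
# Minimal sketch: check a kernel config (as text) for options commonly
# needed by systemtap's DWARF-based probes. The REQUIRED list is an
# assumption based on the grep above, not an authoritative list.

REQUIRED = ["CONFIG_KPROBES", "CONFIG_DEBUG_INFO", "CONFIG_DEBUG_FS", "CONFIG_RELAY"]

def enabled_options(config_text):
    """Return the set of options set to y or m in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and "... is not set" lines
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return enabled

def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in order."""
    return [opt for opt in required if opt not in enabled_options(config_text)]
```

Run against the file for the running kernel, e.g. `missing_options(open("/boot/config-6.6.13+bpo-amd64").read())`, which should return an empty list for the config shown above.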
> Sent: Tuesday, 26.03.2024 at 12:15
> From: "Benjamin Coddington" <bcodding@...hat.com>
> To: "Chuck Lever III" <chuck.lever@...cle.com>
> Cc: "Jan Schunk" <scpcom@....de>, "Jeff Layton" <jlayton@...nel.org>, "Neil Brown" <neilb@...e.de>, "Olga Kornievskaia" <kolga@...app.com>, "Dai Ngo" <dai.ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>, "Linux NFS Mailing List" <linux-nfs@...r.kernel.org>, linux-kernel@...r.kernel.org
> Subject: Re: [External] : nfsd: memory leak when client does many file operations
>
> On 25 Mar 2024, at 16:11, Chuck Lever III wrote:
>
> >> On Mar 25, 2024, at 3:55 PM, Jan Schunk <scpcom@....de> wrote:
> >>
> >> The VM is now running 20 hours with 512MB RAM, no desktop, without the "noatime" mount option and without the "async" export option.
> >>
> >> Currently there is no issue, but the memory usage is still constantly growing. It may just take longer before something happens.
> >>
> >> top - 00:49:49 up 3 min, 1 user, load average: 0,21, 0,19, 0,09
> >> Tasks: 111 total, 1 running, 110 sleeping, 0 stopped, 0 zombie
> >> %CPU(s): 0,2 us, 0,3 sy, 0,0 ni, 99,5 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
> >> MiB Spch: 467,0 total, 302,3 free, 89,3 used, 88,1 buff/cache
> >> MiB Swap: 975,0 total, 975,0 free, 0,0 used. 377,7 avail Spch
> >>
> >> top - 15:05:39 up 14:19, 1 user, load average: 1,87, 1,72, 1,65
> >> Tasks: 104 total, 1 running, 103 sleeping, 0 stopped, 0 zombie
> >> %CPU(s): 0,2 us, 4,9 sy, 0,0 ni, 53,3 id, 39,0 wa, 0,0 hi, 2,6 si, 0,0 st
> >> MiB Spch: 467,0 total, 21,2 free, 147,1 used, 310,9 buff/cache
> >> MiB Swap: 975,0 total, 952,9 free, 22,1 used. 319,9 avail Spch
> >>
> >> top - 20:48:16 up 20:01, 1 user, load average: 5,02, 2,72, 2,08
> >> Tasks: 104 total, 5 running, 99 sleeping, 0 stopped, 0 zombie
> >> %CPU(s): 0,2 us, 46,4 sy, 0,0 ni, 11,9 id, 2,3 wa, 0,0 hi, 39,2 si, 0,0 st
> >> MiB Spch: 467,0 total, 16,9 free, 190,8 used, 271,6 buff/cache
> >> MiB Swap: 975,0 total, 952,9 free, 22,1 used. 276,2 avail Spch
> >
> > I don't see anything in your original memory dump that
> > might account for this. But I'm at a loss because I'm
> > a kernel developer, not a support guy -- I don't have
> > any tools or expertise that can troubleshoot a system
> > without rebuilding a kernel with instrumentation. My
> > first instinct is to tell you to bisect between v6.3
> > and v6.4, or at least enable kmemleak, but I'm guessing
> > you don't build your own kernels.
> >
> > My only recourse at this point would be to try to
> > reproduce it myself, but unfortunately I've just
> > upgraded my whole lab to Fedora 39, and there's a grub
> > bug that prevents booting any custom-built kernel
> > on my hardware.
> >
> > So I'm stuck until I can nail that down. Anyone else
> > care to help out?
>
> Sure - I can throw some stuff..
>
> Can we dig into which memory slabs might be growing? Something like:
>
> watch -d "cat /proc/slabinfo | grep nfsd"
>
> .. for a bit might show what is growing.
>
> Then use a systemtap script like the one below to trace the allocations - use:
>
> stap -v --all-modules kmem_alloc.stp <slab_name>
>
> Ben
>
>
> 8<---------------------------- save as kmem_alloc.stp ----------------------------
>
> # This script displays the number of allocations from the given slab and the backtraces leading up to them.
>
> global slab = @1
> global stats, stacks
> probe kernel.function("kmem_cache_alloc") {
>     if (kernel_string($s->name) == slab) {
>         stats[execname()] <<< 1
>         stacks[execname(),kernel_string($s->name),backtrace()] <<< 1
>     }
> }
> # Exit after 10 seconds
> # probe timer.ms(10000) { exit () }
> probe end {
>     printf("Number of %s slab allocations by process\n", slab)
>     foreach ([exec] in stats) {
>         printf("%s:\t%d\n", exec, @count(stats[exec]))
>     }
>     printf("\nBacktrace of processes when allocating\n")
>     foreach ([proc,cache,bt] in stacks) {
>         printf("Exec: %s Name: %s Count: %d\n", proc, cache, @count(stacks[proc,cache,bt]))
>         print_stack(bt)
>         printf("\n-------------------------------------------------------\n\n")
>     }
> }
>
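The `watch -d "cat /proc/slabinfo | grep nfsd"` step suggested above can also be automated. This is a minimal sketch that diffs two /proc/slabinfo snapshots and reports which caches grew; it assumes the standard slabinfo v2.1 column layout (name in column 1, <active_objs> in column 2), and the "nfsd" prefix is just the example from this thread:

```python
# Minimal sketch: diff two /proc/slabinfo snapshots to find growing
# slab caches. Assumes the standard slabinfo v2.1 column layout:
# name <active_objs> <num_objs> <objsize> ...

def parse_slabinfo(text):
    """Return {cache_name: active_objs} from /proc/slabinfo contents."""
    counts = {}
    for line in text.splitlines():
        if line.startswith(("slabinfo", "# name")):
            continue  # skip the version and column-header lines
        fields = line.split()
        if len(fields) < 2:
            continue
        counts[fields[0]] = int(fields[1])  # column 2 is <active_objs>
    return counts

def growing_caches(before, after, prefix=""):
    """Caches matching prefix whose active-object count increased."""
    return {
        name: count - before.get(name, 0)
        for name, count in after.items()
        if name.startswith(prefix) and count > before.get(name, 0)
    }
```

Usage: read /proc/slabinfo once (as root), sleep for a while, read it again, then print `growing_caches(snap1, snap2, "nfsd")`; whichever cache keeps a positive delta across runs is the candidate slab name to pass to kmem_alloc.stp.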