Message-ID: <6F16BCCE-3000-4BCB-A3B4-95B4767E3577@oracle.com>
Date: Sun, 24 Mar 2024 22:13:28 +0000
From: Chuck Lever III <chuck.lever@...cle.com>
To: Jan Schunk <scpcom@....de>
CC: Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>, Olga Kornievskaia <kolga@...app.com>, Dai Ngo <dai.ngo@...cle.com>, Tom Talpey <tom@...pey.com>, Linux NFS Mailing List <linux-nfs@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [External] : nfsd: memory leak when client does many file operations



> On Mar 24, 2024, at 5:39 PM, Jan Schunk <scpcom@....de> wrote:
> 
> Yes, the VM is x86_64.
> 
> "pgrep -c nfsd" says: 9
> 
> I use NFS version 3.
> 
> All network ports are connected with 1GBit/s.
> 
> The exported file system is ext4.
> 
> I do not use any authentication.
> 
> The mount options in /etc/fstab are:
> rw,noatime,nfsvers=3,proto=tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,noauto
> 
> The line in /etc/exports:
> /export/data3 192.168.0.0/16(fsid=<uuid>,rw,no_root_squash,async,no_subtree_check)

Is it possible to reproduce this issue without the "noatime"
mount option and without the "async" export option?
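For example, the adjusted lines might look like this (hypothetical: the device and mount point in the fstab line are placeholders, since the original only quoted the option string, and everything else is kept the same):

```shell
# /etc/fstab -- "noatime" dropped from the mount options:
server:/export/data3 /mnt/data3 nfs rw,nfsvers=3,proto=tcp,hard,nointr,timeo=600,rsize=32768,wsize=32768,noauto 0 0

# /etc/exports -- "async" replaced with the default "sync":
/export/data3 192.168.0.0/16(fsid=<uuid>,rw,no_root_squash,sync,no_subtree_check)
```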


>> Sent: Sunday, 24 March 2024 at 22:10
>> From: "Chuck Lever III" <chuck.lever@...cle.com>
>> To: "Jan Schunk" <scpcom@....de>
>> Cc: "Jeff Layton" <jlayton@...nel.org>, "Neil Brown" <neilb@...e.de>, "Olga Kornievskaia" <kolga@...app.com>, "Dai Ngo" <dai.ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>, "Linux NFS Mailing List" <linux-nfs@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
>> Subject: Re: [External] : nfsd: memory leak when client does many file operations
>> 
>> 
>>> On Mar 24, 2024, at 4:48 PM, Jan Schunk <scpcom@....de> wrote:
>>> 
>>> The "heavy usage" is a simple script running on the client that does the following:
>>> 1. Create an empty git repository on the share
>>> 2. Unpack a tar.gz archive (Qnap GPL source code)
>>> 3. Remove some folders/files
>>> 4. Use diff to compare it with an older version
>>> 5. Commit the result to git
>>> 6. Repeat from step 2 with the next archive
>>> 
>>> On my armhf NAS the other memory consuming workload is an SMB server.
>> 
>> I'm not sure any of us has a Freescale system to try this ...
>> 
>> 
>>> On the test VM the other memory consuming workload is a GNOME desktop.
>> 
>> ... and so I'm hoping this VM is an x86_64 system.
>> 
>> 
>>> But it does not make much difference if I stop other services it just takes a bit longer until the same issue happens.
>>> The size of swap also does not make a difference.
>> 
>> What is the nfsd thread count on the server? 'pgrep -c nfsd'
>> 
>> What version of NFS does your client mount with?
>> 
>> What is the speed of the network between your client and server?
>> 
>> What is the type of the exported file system?
>> 
>> Do you use NFS with Kerberos?
>> 
>> 
>>>> Sent: Sunday, 24 March 2024 at 21:14
>>>> From: "Chuck Lever III" <chuck.lever@...cle.com>
>>>> To: "Jan Schunk" <scpcom@....de>
>>>> Cc: "Jeff Layton" <jlayton@...nel.org>, "Neil Brown" <neilb@...e.de>, "Olga Kornievskaia" <kolga@...app.com>, "Dai Ngo" <dai.ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>, "Linux NFS Mailing List" <linux-nfs@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
>>>> Subject: Re: [External] : nfsd: memory leak when client does many file operations
>>>> 
>>>> 
>>>> 
>>>>> On Mar 24, 2024, at 3:57 PM, Jan Schunk <scpcom@....de> wrote:
>>>>> 
>>>>> Issue found on: v6.5.13, v6.6.13, v6.6.14, v6.6.20 and v6.8.1
>>>>> Not found on: v6.4, v6.1.82 and below
>>>>> Architectures: amd64 and arm(hf)
>>>>> 
>>>>> Steps to reproduce:
>>>>> - Create a VM with 1GB RAM
>>>>> - Install Debian 12
>>>>> - Install linux-image-6.6.13+bpo-amd64-unsigned and nfs-kernel-server
>>>>> - Export some folder
>>>>> On the client:
>>>>> - Mount the share
>>>>> - Run a script that produces heavy usage on the share (like unpacking large tar archives that contain many small files into a git repository and committing them)
>>>> 
>>>> Hi Jan, thanks for the report.
>>>> 
>>>> The "produce heavy usage" instruction here is pretty vague.
>>>> I run CI testing with kmemleak enabled, and have not seen
>>>> any leaks on recent kernels when running the git regression
>>>> tests, which are similar to this kind of workload.
>>>> 
>>>> Can you try to narrow the reproducer for us, even just a
>>>> little? What client action exactly is triggering the memory
>>>> leak? Is there any other workload on your NFS server that
>>>> might be consuming memory?
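Since kmemleak is mentioned above: if the reporter's kernel has it available, a scan on the server would help confirm whether nfsd is actually leaking rather than just caching aggressively. A sketch of the standard kmemleak interface (assumes the kernel was built with CONFIG_DEBUG_KMEMLEAK=y and requires root):

```shell
# Mount debugfs if it is not already mounted, trigger an immediate
# kmemleak scan, then dump any suspected leak reports with backtraces.
mount -t debugfs none /sys/kernel/debug 2>/dev/null || true
echo scan > /sys/kernel/debug/kmemleak
cat /sys/kernel/debug/kmemleak
```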
>>>> 
>>>> 
>>>>> On my setup it takes 20-40 hours until the memory is full and the OOM killer gets invoked by nfsd to kill other processes. The memory stays full and the system reboots:
>>>>> 
>>>>> [121969.590000] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,task=dbus-daemon,pid=454,uid=101
>>>>> [121969.600000] Out of memory: Killed process 454 (dbus-daemon) total-vm:6196kB, anon-rss:128kB, file-rss:1408kB, shmem-rss:0kB, UID:101 pgtables:12kB oom_score_adj:-900
>>>>> [121971.700000] oom_reaper: reaped process 454 (dbus-daemon), now anon-rss:0kB, file-rss:64kB, shmem-rss:0kB
>>>>> [121971.920000] nfsd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
>>>>> [121971.930000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
>>>>> [121971.930000] Hardware name: Freescale LS1024A
>>>>> [121971.940000]  unwind_backtrace from show_stack+0xb/0xc
>>>>> [121971.940000]  show_stack from dump_stack_lvl+0x2b/0x34
>>>>> [121971.950000]  dump_stack_lvl from dump_header+0x35/0x212
>>>>> [121971.950000]  dump_header from out_of_memory+0x317/0x34c
>>>>> [121971.960000]  out_of_memory from __alloc_pages+0x8e7/0xbb0
>>>>> [121971.970000]  __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8
>>>>> [121971.970000]  __alloc_pages_bulk from svc_recv+0x9d/0x7d4
>>>>> [121971.980000]  svc_recv from nfsd+0x7d/0xd4
>>>>> [121971.980000]  nfsd from kthread+0xb9/0xcc
>>>>> [121971.990000]  kthread from ret_from_fork+0x11/0x1c
>>>>> [121971.990000] Exception stack(0xc2cadfb0 to 0xc2cadff8)
>>>>> [121971.990000] dfa0:                                     00000000 00000000 00000000 00000000
>>>>> [121972.000000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>>>>> [121972.010000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>>>>> [121972.020000] Mem-Info:
>>>>> [121972.020000] active_anon:101 inactive_anon:127 isolated_anon:29
>>>>> [121972.020000]  active_file:1200 inactive_file:1204 isolated_file:98
>>>>> [121972.020000]  unevictable:394 dirty:296 writeback:17
>>>>> [121972.020000]  slab_reclaimable:13680 slab_unreclaimable:4350
>>>>> [121972.020000]  mapped:637 shmem:4 pagetables:414
>>>>> [121972.020000]  sec_pagetables:0 bounce:0
>>>>> [121972.020000]  kernel_misc_reclaimable:0
>>>>> [121972.020000]  free:7279 free_pcp:184 free_cma:1094
>>>>> [121972.060000] Node 0 active_anon:404kB inactive_anon:508kB active_file:4736kB inactive_file:4884kB unevictable:1576kB isolated(anon):116kB isolated(file):388kB mapped:2548kB dirty:1184kB writeback:68kB shmem:16kB writeback_tmp:0kB kernel_stack:1088kB pagetables:1656kB sec_pagetables:0kB all_unreclaimable? no
>>>>> [121972.090000] Normal free:29116kB boost:18432kB min:26624kB low:28672kB high:30720kB reserved_highatomic:0KB active_anon:404kB inactive_anon:712kB active_file:4788kB inactive_file:4752kB unevictable:1576kB writepending:1252kB present:1048576kB managed:1011988kB mlocked:1576kB bounce:0kB free_pcp:736kB local_pcp:236kB free_cma:4376kB
>>>>> [121972.120000] lowmem_reserve[]: 0 0
>>>>> [121972.120000] Normal: 2137*4kB (UEC) 1173*8kB (UEC) 529*16kB (UEC) 19*32kB (UC) 7*64kB (C) 5*128kB (C) 2*256kB (C) 1*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 29116kB
>>>>> [121972.140000] 2991 total pagecache pages
>>>>> [121972.140000] 166 pages in swap cache
>>>>> [121972.140000] Free swap  = 93424kB
>>>>> [121972.150000] Total swap = 102396kB
>>>>> [121972.150000] 262144 pages RAM
>>>>> [121972.150000] 0 pages HighMem/MovableOnly
>>>>> [121972.160000] 9147 pages reserved
>>>>> [121972.160000] 4096 pages cma reserved
>>>>> [121972.160000] Unreclaimable slab info:
>>>>> [121972.170000] Name                      Used          Total
>>>>> [121972.170000] bio-88                    64KB         64KB
>>>>> [121972.180000] TCPv6                     61KB         61KB
>>>>> [121972.180000] bio-76                    16KB         16KB
>>>>> [121972.190000] bio-188                   11KB         11KB
>>>>> [121972.190000] nfs_read_data             22KB         22KB
>>>>> [121972.200000] kioctx                    15KB         15KB
>>>>> [121972.200000] posix_timers_cache          7KB          7KB
>>>>> [121972.210000] UDP                       63KB         63KB
>>>>> [121972.220000] tw_sock_TCP                3KB          3KB
>>>>> [121972.220000] request_sock_TCP           3KB          3KB
>>>>> [121972.230000] TCP                       62KB         62KB
>>>>> [121972.230000] bio-168                    7KB          7KB
>>>>> [121972.240000] ep_head                    8KB          8KB
>>>>> [121972.240000] request_queue             15KB         15KB
>>>>> [121972.250000] bio-124                   18KB         40KB
>>>>> [121972.250000] biovec-max               264KB        264KB
>>>>> [121972.260000] biovec-128                63KB         63KB
>>>>> [121972.260000] biovec-64                157KB        157KB
>>>>> [121972.270000] skbuff_small_head         94KB         94KB
>>>>> [121972.270000] skbuff_fclone_cache         55KB         63KB
>>>>> [121972.280000] skbuff_head_cache         59KB         59KB
>>>>> [121972.280000] fsnotify_mark_connector         16KB         28KB
>>>>> [121972.290000] sigqueue                  19KB         31KB
>>>>> [121972.300000] shmem_inode_cache       1622KB       1662KB
>>>>> [121972.300000] kernfs_iattrs_cache         15KB         15KB
>>>>> [121972.310000] kernfs_node_cache       2107KB       2138KB
>>>>> [121972.310000] filp                     259KB        315KB
>>>>> [121972.320000] net_namespace             30KB         30KB
>>>>> [121972.320000] uts_namespace             15KB         15KB
>>>>> [121972.330000] vma_lock                 143KB        179KB
>>>>> [121972.330000] vm_area_struct           459KB        553KB
>>>>> [121972.340000] sighand_cache            191KB        220KB
>>>>> [121972.340000] task_struct              378KB        446KB
>>>>> [121972.350000] anon_vma_chain           753KB        804KB
>>>>> [121972.360000] anon_vma                 170KB        207KB
>>>>> [121972.360000] trace_event_file          83KB         83KB
>>>>> [121972.370000] mm_struct                157KB        173KB
>>>>> [121972.370000] vmap_area                217KB        354KB
>>>>> [121972.380000] kmalloc-8k               224KB        224KB
>>>>> [121972.380000] kmalloc-4k               860KB        992KB
>>>>> [121972.390000] kmalloc-2k               352KB        352KB
>>>>> [121972.390000] kmalloc-1k               563KB        576KB
>>>>> [121972.400000] kmalloc-512              936KB        936KB
>>>>> [121972.400000] kmalloc-256              196KB        240KB
>>>>> [121972.410000] kmalloc-192              160KB        169KB
>>>>> [121972.410000] kmalloc-128              546KB        764KB
>>>>> [121972.420000] kmalloc-64              1213KB       1288KB
>>>>> [121972.420000] kmem_cache_node           12KB         12KB
>>>>> [121972.430000] kmem_cache                16KB         16KB
>>>>> [121972.440000] Tasks state (memory values in pages):
>>>>> [121972.440000] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
>>>>> [121972.450000] [    209]     0   209     5140      320        0      320         0    16384      480         -1000 systemd-udevd
>>>>> [121972.460000] [    230]   998   230     2887       55       32       23         0    18432        0             0 systemd-network
>>>>> [121972.470000] [    420]     0   420      596        0        0        0         0     6144       22             0 mdadm
>>>>> [121972.490000] [    421]   102   421     1393       56       32       24         0    10240        0             0 rpcbind
>>>>> [121972.500000] [    429]   996   429     3695       17        0       17         0    20480        0             0 systemd-resolve
>>>>> [121972.510000] [    433]     0   433      494       51        0       51         0     8192        0             0 rpc.idmapd
>>>>> [121972.520000] [    434]     0   434      743       92       33       59         0     8192        7             0 nfsdcld
>>>>> [121972.530000] [    451]     0   451      390        0        0        0         0     6144        0             0 acpid
>>>>> [121972.540000] [    453]   105   453     1380       50       32       18         0    10240       18             0 avahi-daemon
>>>>> [121972.550000] [    454]   101   454     1549       16        0       16         0    12288       32          -900 dbus-daemon
>>>>> [121972.560000] [    466]     0   466     3771       60        0       60         0    14336        0             0 irqbalance
>>>>> [121972.570000] [    475]     0   475     6269       32       32        0         0    18432        0             0 rsyslogd
>>>>> [121972.590000] [    487]   105   487     1347       68       38       30         0    10240        0             0 avahi-daemon
>>>>> [121972.600000] [    492]     0   492     1765        0        0        0         0    12288        0             0 cron
>>>>> [121972.610000] [    493]     0   493     2593        0        0        0         0    16384        0             0 wpa_supplicant
>>>>> [121972.620000] [    494]     0   494      607        0        0        0         0     8192       32             0 atd
>>>>> [121972.630000] [    506]     0   506     1065       25        0       25         0    10240        0             0 rpc.mountd
>>>>> [121972.640000] [    514]   103   514      809       25        0       25         0     8192        0             0 rpc.statd
>>>>> [121972.650000] [    522]     0   522      999       31        0       31         0    10240        0             0 agetty
>>>>> [121972.660000] [    524]     0   524     1540       28        0       28         0    12288        0             0 agetty
>>>>> [121972.670000] [    525]     0   525     9098       56       32       24         0    34816        0             0 unattended-upgr
>>>>> [121972.690000] [    526]     0   526     2621      320        0      320         0    14336      192         -1000 sshd
>>>>> [121972.700000] [    539]     0   539      849       32       32        0         0     8192        0             0 in.tftpd
>>>>> [121972.710000] [    544]   113   544     4361        6        6        0         0    16384       25             0 chronyd
>>>>> [121972.720000] [    546]     0   546    16816       62       32       30         0    45056        0             0 winbindd
>>>>> [121972.730000] [    552]     0   552    16905       59       32       27         0    45056        3             0 winbindd
>>>>> [121972.740000] [    559]     0   559    17849       94       32       30        32    49152        4             0 smbd
>>>>> [121972.750000] [    572]     0   572    17409       40       16       24         0    43008       11             0 smbd-notifyd
>>>>> [121972.760000] [    573]     0   573    17412       16       16        0         0    43008       24             0 cleanupd
>>>>> [121972.770000] [    584]     0   584     3036       20        0       20         0    16384        4             0 sshd
>>>>> [121972.780000] [    589]     0   589    16816       32        2       30         0    40960       21             0 winbindd
>>>>> [121972.790000] [    590]     0   590    27009       47       23       24         0    65536       21             0 smbd
>>>>> [121972.810000] [    597]   501   597     3344       91       32       59         0    20480        0           100 systemd
>>>>> [121972.820000] [    653]   501   653     3036        0        0        0         0    16384       33             0 sshd
>>>>> [121972.830000] [    656]   501   656     1938       93       32       61         0    12288        9             0 bash
>>>>> [121972.840000] [    704]     0   704      395      352       64      288         0     6144        0         -1000 watchdog
>>>>> [121972.850000] [    738]   501   738     2834       12        0       12         0    16384        6             0 top
>>>>> [121972.860000] [   4750]     0  4750     4218       44       26       18         0    18432       11             0 proftpd
>>>>> [121972.870000] [   4768]     0  4768      401       31        0       31         0     6144        0             0 apt.systemd.dai
>>>>> [121972.880000] [   4772]     0  4772      401       31        0       31         0     6144        0             0 apt.systemd.dai
>>>>> [121972.890000] [   4778]     0  4778    13556       54        0       54         0    59392       26             0 apt-get
>>>>> [121972.900000] Out of memory and no killable processes...
>>>>> [121972.910000] Kernel panic - not syncing: System is deadlocked on memory
>>>>> [121972.920000] CPU: 1 PID: 537 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
>>>>> [121972.920000] Hardware name: Freescale LS1024A
>>>>> [121972.930000]  unwind_backtrace from show_stack+0xb/0xc
>>>>> [121972.930000]  show_stack from dump_stack_lvl+0x2b/0x34
>>>>> [121972.940000]  dump_stack_lvl from panic+0xbf/0x264
>>>>> [121972.940000]  panic from out_of_memory+0x33f/0x34c
>>>>> [121972.950000]  out_of_memory from __alloc_pages+0x8e7/0xbb0
>>>>> [121972.950000]  __alloc_pages from __alloc_pages_bulk+0x26d/0x3d8
>>>>> [121972.960000]  __alloc_pages_bulk from svc_recv+0x9d/0x7d4
>>>>> [121972.960000]  svc_recv from nfsd+0x7d/0xd4
>>>>> [121972.970000]  nfsd from kthread+0xb9/0xcc
>>>>> [121972.970000]  kthread from ret_from_fork+0x11/0x1c
>>>>> [121972.980000] Exception stack(0xc2cadfb0 to 0xc2cadff8)
>>>>> [121972.980000] dfa0:                                     00000000 00000000 00000000 00000000
>>>>> [121972.990000] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>>>>> [121973.000000] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>>>>> [121973.010000] CPU0: stopping
>>>>> [121973.010000] CPU: 0 PID: 540 Comm: nfsd Not tainted 6.8.1+nas5xx #nas5xx
>>>>> [121973.010000] Hardware name: Freescale LS1024A
>>>>> [121973.010000]  unwind_backtrace from show_stack+0xb/0xc
>>>>> [121973.010000]  show_stack from dump_stack_lvl+0x2b/0x34
>>>>> [121973.010000]  dump_stack_lvl from do_handle_IPI+0x151/0x178
>>>>> [121973.010000]  do_handle_IPI from ipi_handler+0x13/0x18
>>>>> [121973.010000]  ipi_handler from handle_percpu_devid_irq+0x55/0x144
>>>>> [121973.010000]  handle_percpu_devid_irq from generic_handle_domain_irq+0x17/0x20
>>>>> [121973.010000]  generic_handle_domain_irq from gic_handle_irq+0x5f/0x70
>>>>> [121973.010000]  gic_handle_irq from generic_handle_arch_irq+0x27/0x34
>>>>> [121973.010000]  generic_handle_arch_irq from call_with_stack+0xd/0x10
>>>>> [121973.010000] Rebooting in 90 seconds..
>>>> 
>>>> --
>>>> Chuck Lever
>>>> 
>>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 

--
Chuck Lever

