Message-ID: <d9255749-bf1e-c498-ace6-048d36fa962f@gmail.com>
Date:   Wed, 23 Aug 2023 18:14:03 +0700
From:   Bagas Sanjaya <bagasdotme@...il.com>
To:     Chuck Lever <chuck.lever@...cle.com>,
        Jeff Layton <jlayton@...nel.org>,
        Trond Myklebust <trond.myklebust@...merspace.com>,
        Anna Schumaker <anna@...nel.org>, greg@...g.net.au
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux Regressions <regressions@...ts.linux.dev>,
        Linux Network File System <linux-nfs@...r.kernel.org>
Subject: Fwd: kernel 6.4/6.5 nfs 4.1 unresponsive

Hi,

I noticed a regression report on Bugzilla [1]. Quoting from it:

> I have two Synology Disk station NAS devices with NFS mounts present on Gentoo servers with the following fstab mount configuration:
> 
> 10.200.1.247:/volume1/filer02-sata      /mnt/filer02-sata       nfs     vers=4.1,tcp,rsize=32768,wsize=32768,nolock,noatime,nodiratime,hard,timeo=60,retry=6,retrans=6,nconnect=4 0 0
> 10.200.1.247:/volume1/filer03-sata      /mnt/filer03-sata       nfs     vers=4.1,tcp,rsize=32768,wsize=32768,nolock,noatime,nodiratime,hard,timeo=60,retry=6,retrans=6,nconnect=4 0 0
> 10.200.1.246:/volume1/filer04-sata      /mnt/filer04-sata       nfs     vers=4.1,tcp,rsize=32768,wsize=32768,nolock,noatime,nodiratime,hard,timeo=60,retry=6,retrans=6,nconnect=4 0 0
> 
> 
> On Linux Kernel 6.3.6 these work perfectly fine.
> 
> As soon as I upgrade to 6.4 (tested 6.4.7 through 6.4.11) or 6.5-rc7, NFS mounts randomly hang and block system operation with high load, eventually resulting in a system freeze.
> 
> dmesg/syslog:
> 
> Aug 22 18:13:49 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:13:49 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:13:49 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:13:49 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:15:23 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:05 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> Aug 22 18:16:54 sjc-www2 kernel: nfs: server 10.200.1.247 OK
> 
> 
> The box in question that I have been testing the kernel upgrades on has 1 x 10G NIC set with MTU 9000 for NFS volumes, and I can successfully ping the NFS host with 9000-byte packets:
> 
> sjc-www2 ~ # ping -4 -s 9000 10.200.1.247
> PING 10.200.1.247 (10.200.1.247) 9000(9028) bytes of data.
> 9008 bytes from 10.200.1.247: icmp_seq=1 ttl=64 time=0.205 ms
> 9008 bytes from 10.200.1.247: icmp_seq=2 ttl=64 time=0.279 ms
> 9008 bytes from 10.200.1.247: icmp_seq=3 ttl=64 time=0.402 ms
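
A side note on the ping test quoted above: the output shows "9000(9028) bytes of data", meaning each IPv4 packet is 9028 bytes, which is larger than the 9000-byte MTU and therefore gets fragmented. A successful reply in that case does not by itself show that full-size frames pass unfragmented. A minimal sketch of how the largest unfragmented ICMP payload could be computed, assuming a 20-byte IPv4 header without options, an 8-byte ICMP echo header, and the iputils `ping` with its `-M do` (don't fragment) option. The helper names `max_icmp_payload` and `ping_command` are hypothetical and only for illustration:

```python
# Sketch only: header sizes assume plain IPv4 (no IP options) and ICMP echo.
IPV4_HEADER = 20  # bytes
ICMP_HEADER = 8   # bytes


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host: str, mtu: int) -> list[str]:
    """Build an iputils ping command line that forbids fragmentation (-M do)."""
    return ["ping", "-4", "-M", "do", "-c", "3",
            "-s", str(max_icmp_payload(mtu)), host]


if __name__ == "__main__":
    # Address taken from the report; the MTU is the reporter's stated value.
    print(" ".join(ping_command("10.200.1.247", 9000)))
```

If the resulting command fails with a "message too long" style error while a smaller payload works, that would point at an MTU mismatch somewhere on the path. It would not confirm or rule out the kernel regression itself.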

See Bugzilla for the full thread.
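
For anyone triaging similar logs, the stall durations in the dmesg excerpt quoted above can be summarized with a short script. This is only a sketch: it assumes the exact syslog line format shown in the report (with any leading "> " quote prefix already stripped), assumes the year since syslog timestamps omit it, and the function name `outages` is hypothetical:

```python
import re
from datetime import datetime

# Matches lines such as:
# "Aug 22 18:14:35 sjc-www2 kernel: nfs: server 10.200.1.247 not responding, still trying"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: nfs: server "
    r"(?P<server>\S+) (?P<state>OK|not responding, still trying)$"
)


def outages(lines, year=2023):
    """Return (server, start, end) for each 'not responding' -> 'OK' transition."""
    down_since = {}  # server -> time of first 'not responding' message
    result = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore unrelated log lines
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        server = m["server"]
        if m["state"] == "OK":
            start = down_since.pop(server, None)
            if start is not None:
                result.append((server, start, ts))
        else:
            # Keep only the first 'not responding' timestamp per stall.
            down_since.setdefault(server, ts)
    return result


if __name__ == "__main__":
    import sys
    for server, start, end in outages(sys.stdin):
        print(f"{server}: stalled for {(end - start).total_seconds():.0f}s from {start}")
```

The script reads log lines on standard input, so it can be fed the output of `dmesg` or a syslog file. Repeated messages within one stall are collapsed into a single interval per server.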

Anyway, I'm adding this regression to be tracked by regzbot:

#regzbot introduced: v6.3..v6.4 https://bugzilla.kernel.org/show_bug.cgi?id=217815
#regzbot title: nfs server not responding loop on Synology NAS devices

Thanks.

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=217815

-- 
An old man doll... just what I always wanted! - Clara
