Message-ID: <CAPKjjnrYvzH8hEk9boaBt-fETX3VD2cjjN-Z6iNgwZpHqYUjWw@mail.gmail.com>
Date: Wed, 21 Feb 2024 16:20:02 +0800
From: Zhitao Li <zhitao.li@...rtx.com>
To: Trond Myklebust <trond.myklebust@...merspace.com>, Anna Schumaker <anna@...nel.org>,
Chuck Lever <chuck.lever@...cle.com>, Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
Olga Kornievskaia <kolga@...app.com>, Dai Ngo <Dai.Ngo@...cle.com>, Tom Talpey <tom@...pey.com>
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
Ping Huang <huangping@...rtx.com>
Subject: PROBLEM: NFS client IO fails with ERESTARTSYS when another mount
point with the same export is unmounted with force [NFS] [SUNRPC]
Hi, everyone,
- Facts:
I have a remote NFS export mounted on two different directories on the
same client, with the same options. While IO is in flight under one of
the mount points, I force-unmount the other one. The in-flight IO then
fails with "Unknown error 512", which is ERESTARTSYS.
OS: Linux kernel v6.7.0
NFS mount options: vers=4.1
- My speculation:
When the same export is mounted on different directories with the same
options, the superblock and the SUNRPC client (rpc_clnt) are shared. A
forced unmount kills all rpc_tasks of that client with ERESTARTSYS in
rpc_killall_tasks(). However, no signal is pending for the process
doing the IO in this case, so ERESTARTSYS is not handled by the signal
code on the way back to user mode and reaches user space unchanged.
I think two things are unexpected here:
1. The in-flight IO should not fail when I unmount the other directory,
even though the two directories share the same export.
2. "ERESTARTSYS" should never be visible in user space. EIO may be better.
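
The sharing assumed above can be checked from user space. Mounts backed
by the same superblock report the same device number through stat(2), so
comparing the two mount points is a quick test. This is a minimal sketch
that assumes GNU or busybox "stat -c"; the function names dev_of and
same_superblock are made up for illustration, and the directories are
passed as arguments:

```shell
#!/bin/sh
# Print the device number (st_dev) that stat reports for a path.
dev_of() {
    stat -c '%d' "$1"
}

# Succeed (0) if both paths report the same device number,
# fail (1) if they differ, and return 2 if either path cannot be read.
same_superblock() {
    dev_a=$(dev_of "$1") || return 2
    dev_b=$(dev_of "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Only print a verdict when two directories are given on the command line.
if [ "$#" -eq 2 ]; then
    if same_superblock "$1" "$2"; then
        echo "$1 and $2 share a superblock (device $(dev_of "$1"))"
    else
        echo "$1 and $2 do not report the same device number"
    fi
fi
```

Run it as "sh check.sh /mnt/test1 /mnt/test2" (the file name is
arbitrary). nfs(5) documents "nosharecache" as the option that keeps two
mounts of one export from sharing their caches, so mounting one of the
directories with it and re-running the check would be one way to compare
the two cases. Whether that avoids the failure has not been tested in
this report.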
- Reproduction:
1. Prepare an NFS export, served by either nfsd or nfs-ganesha. For
   example, the export is "ip:/export_path".
2. On the latest stable mainline Linux kernel, v6.7.0, mount the export
   on two different directories with the same options:
       mount -t nfs -o vers=4.1 ip:/export_path /mnt/test1
       mount -t nfs -o vers=4.1 ip:/export_path /mnt/test2
3. Start IO under "/mnt/test1":
       dd if=/dev/urandom of=/mnt/test1/1G bs=1M count=1024 oflag=direct
4. Force-unmount "/mnt/test2" while the IO from step 3 is still running:
       umount -f /mnt/test2
5. The "dd" from step 3 then fails with the following output:
       # dd if=/dev/urandom of=/mnt/test1/1G bs=1M count=1024 oflag=direct
       dd: error writing '/mnt/test1/1G': Unknown error 512
       214+0 records in
       213+0 records out
       223346688 bytes (223 MB, 213 MiB) copied, 7.87017 s, 28.4 MB/s
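
The steps above can be collected into one script. This is only a sketch:
EXPORT, MNT1 and MNT2 are placeholders, the helper names are made up,
the 5 second delay is an arbitrary guess at "IO is in flight", and the
failure is detected by matching the C-locale error text that dd printed
in the output above. It needs root and does nothing unless called with
the argument "run":

```shell
#!/bin/sh
# Placeholders; override them from the environment for a real setup.
EXPORT="${EXPORT:-ip:/export_path}"
MNT1="${MNT1:-/mnt/test1}"
MNT2="${MNT2:-/mnt/test2}"

# Succeed if the text on stdin contains the strerror() text for errno 512.
leaked_erestartsys() {
    grep -q 'Unknown error 512'
}

repro() {
    mkdir -p "$MNT1" "$MNT2" || return 1
    mount -t nfs -o vers=4.1 "$EXPORT" "$MNT1" || return 1
    mount -t nfs -o vers=4.1 "$EXPORT" "$MNT2" || return 1

    # Start the direct-IO writer in the background and keep its stderr.
    log=$(mktemp) || return 1
    LC_ALL=C dd if=/dev/urandom of="$MNT1/1G" bs=1M count=1024 \
        oflag=direct 2>"$log" &
    dd_pid=$!

    sleep 5                 # arbitrary: give dd time to have writes in flight
    umount -f "$MNT2"       # forced unmount of the *other* mount point

    wait "$dd_pid"
    dd_status=$?
    if leaked_erestartsys <"$log"; then
        echo "reproduced: dd exited with $dd_status and reported errno 512"
    else
        echo "not reproduced: dd exited with $dd_status"
    fi
    cat "$log"
    rm -f "$log"
}

# Touch the system only when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    repro
fi
```

Example: EXPORT=ip:/export_path sh repro.sh run (the file name is
arbitrary). If the forced unmount lands after dd has already finished,
the script reports "not reproduced"; a larger count or a shorter delay
may be needed on fast servers.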
- Helpful links:
1. v6.7.0 rpc_killall_tasks():
https://elixir.bootlin.com/linux/v6.7/source/net/sunrpc/clnt.c#L869
2. Commit ae67bd3821bb ("SUNRPC: Fix up task signalling"), in v5.2-rc1,
changed the error code of rpc_tasks in rpc_killall_tasks() from EIO to
ERESTARTSYS:
https://github.com/torvalds/linux/commit/ae67bd3821bb0a54d97e7883d211196637d487a9?diff=split&w=0
Looking forward to your early reply :)
Best regards,
Zhitao Li, in SmartX.