Message-ID: <05cd01d5b43f$b7d88f60$2789ae20$@gmail.com>
Date:   Mon, 16 Dec 2019 18:36:23 -0000
From:   "Robert Milkowski" <rmilkowski@...il.com>
To:     <linux-nfs@...r.kernel.org>
Cc:     "'Trond Myklebust'" <trond.myklebust@...merspace.com>,
        "'Anna Schumaker'" <anna.schumaker@...app.com>,
        <linux-kernel@...r.kernel.org>, <linux-nfs@...r.kernel.org>
Subject: RE: [PATCH] NFSv4: nfs4_do_fsinfo() should not do implicit lease renewals

Hi,

If a sub-filesystem (an NFSv4 mirror mount) is unmounted and then mounted
again (by accessing it), the nfs4_do_fsinfo() function is called,
which currently assumes an implicit lease renewal. I believe this is not
compliant with the RFC.

I've managed to trigger the issue by two different methods:

1) In production

If an NFSv4 filesystem is mounted with sub-mounts (a setup similar to the
one below), each sub-mount is unmounted after nfs_mountpoint_expiry_timeout
of inactivity. If it is then accessed again, it is automatically mounted
again, which currently results in an implicit lease renewal on the client
side. This in turn can open a relatively small window in which the client
thinks its lease is still valid while the NFS server has already expired it.

2) Manual unmount

# cat /etc/exports
/ *(rw,sync)
/var *(rw,sync)


On a Linux NFS client:

# mount -o vers=4 10.50.2.59:/ /mnt/3
$ head /mnt/3/var/log/vmware-vmsvc.log >/dev/null
$ df -h | tail -2
10.50.2.59:/      29G   25G  2.3G  92% /mnt/3
10.50.2.59:/var   20G  2.2G   16G  13% /mnt/3/var

# while [ 1 ]; do date; umount /mnt/3/var; ls /mnt/3/var >/dev/null; sleep 10; done
...

By constantly unmounting the sub-filesystem (/var) and then accessing it so
that it gets mounted again (which triggers nfs4_do_fsinfo()),
cl_last_renewal is set to now on the client, which prevents RENEW
operations from being sent, and eventually the NFS server
expires the lease (common defaults are 60s on Linux and 90s on Solaris
servers).
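
To illustrate the timeline, here is a small user-space model of the loop
above. It is only a sketch and not kernel code: the struct, the function
names and the one-second tick are all hypothetical, and it assumes the
client sends RENEW once two thirds of the lease period have passed since
its recorded last renewal. The remount updates only the client's view,
never the server's.

```c
#include <stdbool.h>

/* Hypothetical model, times in seconds. Not the kernel's data structures. */
struct lease_model {
    long lease_time;           /* lease period granted by the server */
    long client_last_renewal;  /* client's view (models cl_last_renewal) */
    long server_last_renewal;  /* when the server really renewed the lease */
};

/* Models the current behaviour: a remount bumps only the client's view. */
static void remount_with_implicit_renewal(struct lease_model *m, long now)
{
    m->client_last_renewal = now;
}

/* Assumption: RENEW is sent after 2/3 of the lease has elapsed. */
static bool client_would_send_renew(const struct lease_model *m, long now)
{
    return now - m->client_last_renewal >= (2 * m->lease_time) / 3;
}

static bool server_lease_expired(const struct lease_model *m, long now)
{
    return now - m->server_last_renewal > m->lease_time;
}

/*
 * Remount every `interval` seconds and return the first second at which
 * the server has expired the lease, or -1 if that never happens within
 * `horizon` seconds.
 */
static long first_silent_expiry(long lease_time, long interval, long horizon)
{
    struct lease_model m = { lease_time, 0, 0 };

    for (long now = 1; now <= horizon; now++) {
        if (now % interval == 0)
            remount_with_implicit_renewal(&m, now);
        if (client_would_send_renew(&m, now)) {
            /* A real RENEW updates both sides. */
            m.client_last_renewal = now;
            m.server_last_renewal = now;
        }
        if (server_lease_expired(&m, now))
            return now;
    }
    return -1;
}
```

In this model, first_silent_expiry(60, 10, 300) mirrors the loop above
(60s lease, remount every 10s): the client never reaches its RENEW
threshold, so the server-side lease runs out without any RENEW being sent.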

In testing I confirmed that both Linux and Solaris NFSv4 servers do not do
an implicit lease renewal in this case
(nfs4_do_fsinfo() results in GETATTR operations being sent), in which case
the lease might expire and both Linux and Solaris
NFS servers will return NFS4ERR_EXPIRED.

The error is not handled correctly either and results in EIO being
propagated to an application issuing open().
See my other email with subject: [PATCH] NFSv4: open() should try lease
recovery on NFS4ERR_EXPIRED
which contains a fix for the NFS4ERR_EXPIRED handling.

This patch however fixes the issue with implicit lease renewal by not
setting the cl_last_renewal to now,
unless there is no lease set yet.
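
The idea can be sketched in user space as follows. This is not the patch
itself: the struct and function are hypothetical, the field names only
echo cl_lease_time and cl_last_renewal, and "no lease set yet" is modelled
here as a lease time of zero, which is an assumption of this sketch.

```c
/* Hypothetical model of the client lease state; not the kernel structure. */
struct client_model {
    unsigned long lease_time;    /* 0 models "no lease set yet" (assumption) */
    unsigned long last_renewal;  /* models cl_last_renewal */
};

/*
 * Record the lease period reported by fsinfo. The renewal timestamp is
 * initialised only when no lease existed before; later calls leave it
 * alone, so a remount no longer counts as a renewal.
 */
static void set_lease_period(struct client_model *clp,
                             unsigned long lease, unsigned long now)
{
    if (clp->lease_time == 0)
        clp->last_renewal = now;
    clp->lease_time = lease;
}
```

With this shape, the first call sets both fields, while a second call made
on a later remount updates only the lease period.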

This was tested with 5.5.0-rc2, and the provided patch applies on top of
5.5.0-rc2 as well.


BTW: recent ONTAP versions return NFS4ERR_STALE_CLIENTID, which is handled
correctly: the Linux client tries to renew its lease and, if successful,
retries open().
     


Best regards,
 Robert Milkowski

