Message-ID: <850dcbf562b7eb5848278937092d2d8511eb648f.camel@kernel.org>
Date: Mon, 11 Aug 2025 09:03:49 -0400
From: Jeff Layton <jlayton@...nel.org>
To: "zhangjian (CG)" <zhangjian496@...wei.com>, Trond Myklebust
<trondmy@...nel.org>, anna@...nel.org
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Question]nfs: never returned delegation
On Mon, 2025-08-11 at 20:48 +0800, zhangjian (CG) wrote:
> We recently hit an NFS problem on 5.10. The tcpdump output shows a large number of TEST_STATEID requests after each non-privileged request. The client holds more than 400,000 delegations (I read the delegation list from /proc/kcore).
> At first I suspected that the state manager was spending most of its time in nfs_server_reap_expired_delegations. However, all of the delegations are in the NFS_DELEGATION_REVOKED state except 6, which are in NFS_DELEGATION_REFERENCED (also read from /proc/kcore).
> After analyzing the NFS code, I found that if the NFSPROC4_CLNT_DELEGRETURN procedure hits ETIMEOUT, the delegation is marked as NFS4ERR_DELEG_REVOKED and the client never tries to return it again. The NFS server then keeps the revoked delegation in clp->cl_revoked forever. As a result, every subsequent SEQUENCE response carries the RECALLABLE_STATE_REVOKED flag, and the client sends a TEST_STATEID request for every non-revoked delegation.
> The only way to clear this is to restart the NFS server.
> I suspect ETIMEOUT in the NFSPROC4_CLNT_DELEGRETURN procedure is not the only case that can cause an endless stream of TEST_STATEID requests after any non-privileged request.
> I would appreciate advice from the NFS experts on this problem.
>
What should happen is that the client should issue a TEST_STATEID and
then follow up with a FREE_STATEID once it's clear that it has been
revoked. Alternately, if the client expires then the server will purge
any state it held at that point. The server is required to keep a
record of these objects until one of those events occurs.
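As a rough illustration of that flow, here is a toy model in Python. It is only a sketch: ToyServer, client_recover and the method names are hypothetical and do not correspond to kernel functions, and the status strings are simplified stand-ins for the protocol error codes. It shows why the flag only clears once the client frees each revoked stateid (or the client expires), and why a client that has already forgotten a revoked stateid keeps seeing the flag.

```python
# Toy model only: all class and function names are hypothetical, not kernel APIs.
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Tracks valid and revoked delegation stateids for a single client."""
    valid: set = field(default_factory=set)
    revoked: set = field(default_factory=set)  # loosely analogous to clp->cl_revoked

    def revoke(self, stateid):
        # Move a delegation from the valid set to the revoked record.
        self.valid.discard(stateid)
        self.revoked.add(stateid)

    def sequence_flag(self):
        # The "recallable state revoked" flag stays set while any record remains.
        return bool(self.revoked)

    def test_stateid(self, stateid):
        if stateid in self.revoked:
            return "NFS4ERR_DELEG_REVOKED"
        if stateid in self.valid:
            return "NFS4_OK"
        return "NFS4ERR_BAD_STATEID"

    def free_stateid(self, stateid):
        # The server may drop its record once the client acknowledges the revocation.
        self.revoked.discard(stateid)

    def expire_client(self):
        # Client expiry purges everything the server held for that client.
        self.valid.clear()
        self.revoked.clear()


def client_recover(server, held):
    """Test each stateid the client still tracks and free the revoked ones.

    Returns the list of stateids that are still valid.
    """
    still_held = []
    for sid in held:
        status = server.test_stateid(sid)
        if status == "NFS4ERR_DELEG_REVOKED":
            server.free_stateid(sid)
        elif status == "NFS4_OK":
            still_held.append(sid)
    return still_held
```

If the client still tracks the revoked stateid, one pass of client_recover clears the flag. If the client has already dropped it (as in the report above, where only non-revoked delegations get tested), the server's record survives every pass and the flag stays set until the client expires.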
v5.10 is pretty old, and there have been a number of fixes in this area
in both the client and server over the last several years. You may want
to try a newer kernel (or look at doing some backporting).
Cheers,
--
Jeff Layton <jlayton@...nel.org>