[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ec156ccf-9d6c-44be-bd1d-5f3e9b166e77@talpey.com>
Date: Wed, 22 Jan 2025 10:35:32 -0500
From: Tom Talpey <tom@...pey.com>
To: Chuck Lever <chuck.lever@...cle.com>, Jeff Layton <jlayton@...nel.org>,
Neil Brown <neilb@...e.de>, Olga Kornievskaia <okorniev@...hat.com>,
Dai Ngo <Dai.Ngo@...cle.com>
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfsd: always handle RPC_SIGNALLED earlier in
nfsd4_cb_sequence_done()
On 1/22/2025 10:20 AM, Chuck Lever wrote:
> On 1/22/25 10:10 AM, Jeff Layton wrote:
>> The v4.0 client always restarts the callback when the connection is shut
>> down (which is indicated by RPC_SIGNALLED()). The RPC is then requeued
>> and the result eventually should complete (or be aborted).
>>
>> The v4.1 code instead processes the result and only later decides to
>> restart the call. Even more problematic is the fact that it releases the
>> slot beforehand. The restarted call may get a new slot, which would
>> could break DRC handling.
>
> "break DRC handling" -- I'd like to understand this.
>
> NFSD always sets cachethis to false in CB_SEQUENCE, so there is no DRC
> for these operations. The only thing the client saves is the slot
> sequence number IIUC.
>
> Is retrying an uncached operation via a different slot a problem?
The server would ignore any in-progress status and therefore might
cause two requests to be processed in parallel. It might also
rearrange the order of replies. Other than those, maybe not!
I do think it's safest to not reallocate the slot in the 4.1 case.
The slot sequence number is only meaningful in the context of the
original slot.
Tom.
>> Change nfsd4_cb_sequence_done() such that RPC_SIGNALLED() is handled the
>> same way irrespective of the minorversion. Check it early, before
>> processing the CB_SEQUENCE result, and immediately restart the call.
>
> It would be lovely to have some test cases that exercise the server's
> backchannel retry logic.
>
>
>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>> ---
>> fs/nfsd/nfs4callback.c | 32 ++++++++++++++++----------------
>> 1 file changed, 16 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>> index
>> 50e468bdb8d4838b5217346dcc2bd0fec1765c1a..05afdf94c6478c7d698b059fa33dd9e7af49b91e 100644
>> --- a/fs/nfsd/nfs4callback.c
>> +++ b/fs/nfsd/nfs4callback.c
>> @@ -1334,21 +1334,24 @@ static bool nfsd4_cb_sequence_done(struct
>> rpc_task *task, struct nfsd4_callback
>> struct nfsd4_session *session = clp->cl_cb_session;
>> bool ret = true;
>> - if (!clp->cl_minorversion) {
>> - /*
>> - * If the backchannel connection was shut down while this
>> - * task was queued, we need to resubmit it after setting up
>> - * a new backchannel connection.
>> - *
>> - * Note that if we lost our callback connection permanently
>> - * the submission code will error out, so we don't need to
>> - * handle that case here.
>> - */
>> - if (RPC_SIGNALLED(task))
>> - goto need_restart;
>> + /*
>> + * If the backchannel connection was shut down while this
>> + * task was queued, resubmit it after setting up a new backchannel
>> + * connection.
>> + *
>> + * If the backchannel connection is permanently lost, the submission
>> + * code will error out, so there is no need to handle that case
>> here.
>> + *
>> + * For the v4.1+ case, do this without releasing the slot or doing
>> + * any further processing. It's possible that the restarted call
>> will
>> + * be a retransmission of an earlier call to the server and that
>> will
>> + * help ensure that the DRC works correctly.
>> + */
>> + if (RPC_SIGNALLED(task))
>> + goto need_restart;
>> + if (!clp->cl_minorversion)
>> return true;
>> - }
>> if (cb->cb_held_slot < 0)
>> goto need_restart;
>> @@ -1403,9 +1406,6 @@ static bool nfsd4_cb_sequence_done(struct
>> rpc_task *task, struct nfsd4_callback
>> }
>> trace_nfsd_cb_free_slot(task, cb);
>> nfsd41_cb_release_slot(cb);
>> -
>> - if (RPC_SIGNALLED(task))
>> - goto need_restart;
>> out:
>> return ret;
>> retry_nowait:
>>
>> ---
>> base-commit: a8acd0de47f22619d62d548b86bcfc9a4de2b2c6
>> change-id: 20250122-nfsd-6-14-77c1ad5d9f01
>>
>> Best regards,
>
>
Powered by blists - more mailing lists