Message-ID: <d78576c1-d743-4ec2-bf8c-d87603460ac1@oracle.com>
Date: Thu, 20 Mar 2025 09:16:15 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: njha@...estreet.com, Trond Myklebust <trondmy@...nel.org>,
Anna Schumaker <anna@...nel.org>, Jeff Layton <jlayton@...nel.org>,
Neil Brown <neilb@...e.de>, Olga Kornievskaia <okorniev@...hat.com>,
Dai Ngo <Dai.Ngo@...cle.com>, Tom Talpey <tom@...pey.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/2] fix gss seqno handling to be more rfc-compliant

On 3/19/25 1:02 PM, Nikhil Jha via B4 Relay wrote:
> When the client retransmits an operation (for example, because the
> server is slow to respond), a new GSS sequence number is associated with
> the XID. In the current kernel code the original sequence number is
> discarded. Subsequently, if a response to the original request is
> received there will be a GSS sequence number mismatch. A mismatch will
> trigger another retransmit, possibly repeating the cycle, and after some
> number of failed retries EACCES is returned.
>
> RFC 2203, section 5.3.3.1, suggests a possible solution: "cache the
> RPCSEC_GSS sequence number of each request it sends" and "compute the
> checksum of each sequence number in the cache to try to match the
> checksum in the reply's verifier." This is what FreeBSD's implementation
> does (rpc_gss_validate in sys/rpc/rpcsec_gss/rpcsec_gss.c).
>
> However, even with this cache, retransmits directly caused by a seqno
> mismatch can still cause a bad message interleaving that results in this
> bug. The RFC already suggests ignoring incorrect seqnos on the server
> side, and this seems symmetric, so this patchset also applies that
> behavior to the client.
>
> These two patches are *not* dependent on each other. I tested them by
> delaying packets with a Python script hooked up to NFQUEUE. If it would
> be helpful I can send this script along as well.
>
> Signed-off-by: Nikhil Jha <njha@...estreet.com>
> ---
> Changes since v1:
> * Maintain the invariant that the first seqno is always first in
> rq_seqnos, so that it doesn't need to be stored twice.
> * Minor formatting, and resending with proper mailing-list headers so the
> patches are easier to work with.
>
> ---
> Nikhil Jha (2):
> sunrpc: implement rfc2203 rpcsec_gss seqnum cache
> sunrpc: don't immediately retransmit on seqno miss
>
> include/linux/sunrpc/xprt.h | 17 +++++++++++-
> include/trace/events/rpcgss.h | 4 +--
> include/trace/events/sunrpc.h | 2 +-
> net/sunrpc/auth_gss/auth_gss.c | 59 ++++++++++++++++++++++++++----------------
> net/sunrpc/clnt.c | 9 +++++--
> net/sunrpc/xprt.c | 3 ++-
> 6 files changed, 64 insertions(+), 30 deletions(-)
> ---
> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
> change-id: 20250314-rfc2203-seqnum-cache-52389d14f567
>
> Best regards,

This seems like a sensible thing to do to me.

Acked-by: Chuck Lever <chuck.lever@...cle.com>

--
Chuck Lever
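
For readers following the thread, the approach in the quoted cover letter can be sketched in a few lines of C. This is a minimal illustration, not code from the patches or from FreeBSD: the names demo_req, demo_checksum, demo_record_seqno and demo_handle_reply, the cache size of 3, and the eviction policy are all assumptions made for the example. The toy checksum merely stands in for the GSS MIC that a real client would compute over each cached sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQNO_CACHE_MAX 3	/* hypothetical cap on remembered seqnos */

/* Hypothetical per-request state; the names are illustrative only. */
struct demo_req {
	uint32_t seqnos[SEQNO_CACHE_MAX];	/* seqnos[0]: first transmission */
	unsigned int num;			/* number of valid entries */
};

/*
 * Stand-in for the GSS MIC over a sequence number. A real client would
 * ask its security context for the MIC; this toy mixing function only
 * keeps the example self-contained.
 */
static uint32_t demo_checksum(uint32_t key, uint32_t seqno)
{
	uint32_t h = (seqno ^ key) * 2654435761u;

	return h ^ (h >> 16);
}

/* Remember the seqno used for a transmission or retransmission. */
static void demo_record_seqno(struct demo_req *req, uint32_t seqno)
{
	unsigned int i;

	assert(req->num <= SEQNO_CACHE_MAX);
	if (req->num < SEQNO_CACHE_MAX) {
		req->seqnos[req->num++] = seqno;
		return;
	}
	/*
	 * Cache full (assumed policy): keep slot 0, which holds the first
	 * seqno, and drop the oldest retransmission instead.
	 */
	for (i = 1; i < SEQNO_CACHE_MAX - 1; i++)
		req->seqnos[i] = req->seqnos[i + 1];
	req->seqnos[SEQNO_CACHE_MAX - 1] = seqno;
}

enum demo_action {
	DEMO_ACCEPT,		/* verifier matches a cached seqno */
	DEMO_KEEP_WAITING,	/* no match: ignore the reply, do not retransmit */
};

/*
 * Compare the reply's verifier against the checksum of every cached
 * seqno. On a miss the reply is ignored and the caller keeps waiting,
 * rather than retransmitting right away.
 */
static enum demo_action demo_handle_reply(const struct demo_req *req,
					  uint32_t key, uint32_t verifier)
{
	unsigned int i;

	for (i = 0; i < req->num; i++)
		if (demo_checksum(key, req->seqnos[i]) == verifier)
			return DEMO_ACCEPT;
	return DEMO_KEEP_WAITING;
}
```

With two transmissions recorded under seqnos 10 and 11, a reply whose verifier was computed from either value is accepted. A verifier computed from any other seqno yields DEMO_KEEP_WAITING, which corresponds to dropping the reply without triggering a new retransmit.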