[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8753ffaf-0146-45c6-9189-bd8ea3e74e71@oracle.com>
Date: Wed, 30 Jul 2025 16:58:15 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: Olga Kornievskaia <okorniev@...hat.com>, jlayton@...nel.org,
trondmy@...merspace.com, anna.schumaker@...cle.com, hch@....de,
sagi@...mberg.me, kch@...dia.com, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com
Cc: linux-nfs@...r.kernel.org, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, kernel-tls-handshake@...ts.linux.dev,
neil@...wn.name, Dai.Ngo@...cle.com, tom@...pey.com, hare@...e.de,
horms@...nel.org, kbusch@...nel.org
Subject: Re: [PATCH 1/4] sunrpc: fix handling of server side tls alerts
On 7/30/25 4:08 PM, Olga Kornievskaia wrote:
> Scott Mayhew discovered a security exploit in NFS over TLS in
> tls_alert_recv() due to its assumption it can read data from
> the msg iterator's kvec..
>
> kTLS implementation splits TLS non-data record payload between
> the control message buffer (which includes the type such as TLS
> aler or TLS cipher change) and the rest of the payload (say TLS
> alert's level/description) which goes into the msg payload buffer.
>
> This patch proposes to rework how control messages are setup and
> used by sock_recvmsg().
>
> If no control message structure is setup, kTLS layer will read and
> process TLS data record types. As soon as it encounters a TLS control
> message, it would return an error. At that point, NFS can setup a
> kvec backed msg buffer and read in the control message such as a
> TLS alert. Scott found that msg iterator can advance the kvec
> pointer as a part of the copy process thus we need to revert the
> iterator before calling into the tls_alert_recv.
>
> Reported-by: Scott Mayhew <smayhew@...hat.com>
> Fixes: 5e052dda121e ("SUNRPC: Recognize control messages in server-side TCP socket code")
> Suggested-by: Trond Myklebust <trondmy@...merspace.com>
> Suggested-by: Scott Mayhew <smayhew@...hat.com>
> Signed-off-by: Olga Kornievskaia <okorniev@...hat.com>
> ---
> net/sunrpc/svcsock.c | 43 +++++++++++++++++++++++++++++++++++--------
> 1 file changed, 35 insertions(+), 8 deletions(-)
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 46c156b121db..e2c5e0e626f9 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -257,20 +257,47 @@ svc_tcp_sock_process_cmsg(struct socket *sock, struct msghdr *msg,
> }
>
> static int
> -svc_tcp_sock_recv_cmsg(struct svc_sock *svsk, struct msghdr *msg)
> +svc_tcp_sock_recv_cmsg(struct socket *sock, unsigned int *msg_flags)
> {
> union {
> struct cmsghdr cmsg;
> u8 buf[CMSG_SPACE(sizeof(u8))];
> } u;
> - struct socket *sock = svsk->sk_sock;
> + u8 alert[2];
> + struct kvec alert_kvec = {
> + .iov_base = alert,
> + .iov_len = sizeof(alert),
> + };
> + struct msghdr msg = {
> + .msg_flags = *msg_flags,
> + .msg_control = &u,
> + .msg_controllen = sizeof(u),
> + };
> + int ret;
> +
> + iov_iter_kvec(&msg.msg_iter, ITER_DEST, &alert_kvec, 1,
> + alert_kvec.iov_len);
> + ret = sock_recvmsg(sock, &msg, MSG_DONTWAIT);
> + if (ret > 0 &&
> + tls_get_record_type(sock->sk, &u.cmsg) == TLS_RECORD_TYPE_ALERT) {
> + iov_iter_revert(&msg.msg_iter, ret);
> + ret = svc_tcp_sock_process_cmsg(sock, &msg, &u.cmsg, -EAGAIN);
> + }
> + return ret;
> +}
> +
> +static int
> +svc_tcp_sock_recvmsg(struct svc_sock *svsk, struct msghdr *msg)
> +{
> int ret;
> + struct socket *sock = svsk->sk_sock;
>
> - msg->msg_control = &u;
> - msg->msg_controllen = sizeof(u);
> ret = sock_recvmsg(sock, msg, MSG_DONTWAIT);
> - if (unlikely(msg->msg_controllen != sizeof(u)))
> - ret = svc_tcp_sock_process_cmsg(sock, msg, &u.cmsg, ret);
> + if (msg->msg_flags & MSG_CTRUNC) {
> + msg->msg_flags &= ~(MSG_CTRUNC | MSG_EOR);
> + if (ret == 0 || ret == -EIO)
> + ret = svc_tcp_sock_recv_cmsg(sock, &msg->msg_flags);
> + }
> return ret;
> }
>
> @@ -321,7 +348,7 @@ static ssize_t svc_tcp_read_msg(struct svc_rqst *rqstp, size_t buflen,
> iov_iter_advance(&msg.msg_iter, seek);
> buflen -= seek;
> }
> - len = svc_tcp_sock_recv_cmsg(svsk, &msg);
> + len = svc_tcp_sock_recvmsg(svsk, &msg);
> if (len > 0)
> svc_flush_bvec(bvec, len, seek);
>
> @@ -1018,7 +1045,7 @@ static ssize_t svc_tcp_read_marker(struct svc_sock *svsk,
> iov.iov_base = ((char *)&svsk->sk_marker) + svsk->sk_tcplen;
> iov.iov_len = want;
> iov_iter_kvec(&msg.msg_iter, ITER_DEST, &iov, 1, want);
> - len = svc_tcp_sock_recv_cmsg(svsk, &msg);
> + len = svc_tcp_sock_recvmsg(svsk, &msg);
> if (len < 0)
> return len;
> svsk->sk_tcplen += len;
Fwiw, I've already pulled 1/4 into nfsd-testing. But 4/4 might not apply
until the others are in the tree. We might want these to go through a
single tree.
Or, can we delay 4/4 until 6.18 ?
--
Chuck Lever
Powered by blists - more mailing lists