[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9ea74200-7cbc-4a30-9503-864dcec9b45d@suse.de>
Date: Mon, 27 Jan 2025 08:37:36 +0100
From: Hannes Reinecke <hare@...e.de>
To: Caleb Sander Mateos <csander@...estorage.com>,
Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>
Cc: Maurizio Lombardi <mlombard@...hat.com>, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] nvme-tcp: fix connect failure on receiving partial
ICResp PDU
On 1/24/25 19:43, Caleb Sander Mateos wrote:
> nvme_tcp_init_connection() attempts to receive an ICResp PDU but only
> checks that the return value from recvmsg() is non-negative. If the
> sender closes the TCP connection or sends fewer than 128 bytes, this
> check will pass even though the full PDU wasn't received.
>
> Ensure the full ICResp PDU is received by checking that recvmsg()
> returns the expected 128 bytes.
>
> Additionally set the MSG_WAITALL flag for recvmsg(), as a sender could
> split the ICResp over multiple TCP frames. Without MSG_WAITALL,
> recvmsg() could return prematurely with only part of the PDU.
>
> Signed-off-by: Caleb Sander Mateos <csander@...estorage.com>
> Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
> ---
> v4: keep recvmsg() error return value
> v3: fix return value to indicate error
> v2: add Fixes tag
>
> drivers/nvme/host/tcp.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index e9ff6babc540..56679eb8c0d6 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1446,15 +1446,18 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
> iov.iov_len = sizeof(*icresp);
> if (nvme_tcp_queue_tls(queue)) {
> msg.msg_control = cbuf;
> msg.msg_controllen = sizeof(cbuf);
> }
> + msg.msg_flags = MSG_WAITALL;
> ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
> iov.iov_len, msg.msg_flags);
But won't we have to wait for a TCP timeout now if the sender sends less
than 128 bytes? With this patch we always wait for 128 bytes, and
possibly wait for TCP timeout if not.
Testcase for this would be nice ...
And I need to check if secure concatenation is affected here; with
secure concatenation we need to peek at the first packet to check
if it's an ICRESP or a TLS negotiation.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
Powered by blists - more mailing lists