Message-ID: <17d0b30a-758c-4a84-9879-0a070656f15e@suse.de>
Date: Mon, 3 Feb 2025 08:20:08 +0100
From: Hannes Reinecke <hare@...e.de>
To: Caleb Sander <csander@...estorage.com>, Sagi Grimberg <sagi@...mberg.me>
Cc: Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Christoph Hellwig <hch@....de>, Maurizio Lombardi <mlombard@...hat.com>,
linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] nvme-tcp: fix connect failure on receiving partial
ICResp PDU
On 2/3/25 01:19, Caleb Sander wrote:
> On Fri, Jan 31, 2025 at 12:17 AM Sagi Grimberg <sagi@...mberg.me> wrote:
> ...
>> Caleb, can you please make sure to test this patch with TLS?
>
> Can you point me to some documentation for that? I tried setting up an
> nvmet_tcp port to use TLS and connecting to it as described in the
> cover letter for the patch series (https://lwn.net/Articles/941139/).
> But the TLS Server Hello seems to fail with EACCES. Any idea what I'm
> doing wrong?
> $ modprobe nvmet_tcp
> $ modprobe null_blk nr_devices=1
> $ mkdir /sys/kernel/config/nvmet/subsystems/nqn.nvmet
> $ echo 1 > /sys/kernel/config/nvmet/subsystems/nqn.nvmet/attr_allow_any_host
> $ mkdir /sys/kernel/config/nvmet/subsystems/nqn.nvmet/namespaces/1
> $ echo /dev/nullb0 >
> /sys/kernel/config/nvmet/subsystems/nqn.nvmet/namespaces/1/device_path
> $ echo 1 > /sys/kernel/config/nvmet/subsystems/nqn.nvmet/namespaces/1/enable
> $ mkdir /sys/kernel/config/nvmet/ports/1
> $ echo tcp > /sys/kernel/config/nvmet/ports/1/addr_trtype
> $ echo ipv4 > /sys/kernel/config/nvmet/ports/1/addr_adrfam
> $ echo 127.0.0.1 > /sys/kernel/config/nvmet/ports/1/addr_traddr
> $ echo 4420 > /sys/kernel/config/nvmet/ports/1/addr_trsvcid
> $ echo required > /sys/kernel/config/nvmet/ports/1/addr_treq
> $ echo tls1.3 > /sys/kernel/config/nvmet/ports/1/addr_tsas
> $ ln -s /sys/kernel/config/nvmet/subsystems/nqn.nvmet
> /sys/kernel/config/nvmet/ports/1/subsystems
> $ nvme gen-tls-key --subsysnqn nqn.nvmet --insert
> Inserted TLS key 005e8a74
> $ nvme gen-tls-key --subsysnqn nqn.2014-08.org.nvmexpress.discovery --insert
> Inserted TLS key 22d676b8
> $ tlshd &
> $ nvme discover --transport tcp --traddr 127.0.0.1 --trsvcid 4420 --tls
> Failed to write to /dev/nvme-fabrics: Input/output error
> failed to add controller, error failed to write to nvme-fabrics device
>
> With debug logs enabled, I see the following:
> $ dmesg | tail -6
> [ 440.405298] nvme nvme0: connecting queue 0
> [ 440.405403] nvmet_tcp: queue 0: TLS ServerHello
> [ 440.405433] nvme nvme0: queue 0: start TLS with key 11b456f9
> [ 440.407836] nvmet_tcp: queue 0: TLS handshake done, key 0, status -13
> [ 440.422881] nvme nvme0: queue 0: TLS handshake done, key 0, status -13
> [ 440.422932] nvme nvme0: queue 0: TLS handshake complete, error 13
>
> A tcpdump shows the host sending a TLS Client Hello packet and the
> target immediately closing the connection.
>
The PSK identity has to contain the host NQN _and_ the target NQN.
So you need to call gen-tls-key with both in order to generate a PSK
whose identity can actually be matched on the connection attempt.
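Something along these lines should work (untested here; assuming your
host NQN lives in /etc/nvme/hostnqn):

$ nvme gen-tls-key --hostnqn "$(cat /etc/nvme/hostnqn)" --subsysnqn nqn.nvmet --insert
$ nvme gen-tls-key --hostnqn "$(cat /etc/nvme/hostnqn)" --subsysnqn nqn.2014-08.org.nvmexpress.discovery --insert

The resulting key descriptions in the .nvme keyring should then carry
both NQNs, which is what the lookup during the connection attempt
matches against.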
>> Do you have a reliable way to reproduce this?
>
> Sure, here's a fake Python NVMe/TCP controller that immediately closes
> each connection after receiving the ICReq PDU:
> ```
> import socket
>
> def recv_all(socket, length):
>     result = b''
>     while len(result) < length:
>         result += socket.recv(length - len(result))
>     return result
>
> listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> listen_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
> listen_sock.bind(('', 4420))
> listen_sock.listen()
> while True:
>     client_sock, _ = listen_sock.accept()
>     recv_all(client_sock, 128) # ICReq
>     client_sock.close()
> ```
> Attempting to connect to it reports an error about the ICResp PDU type
> field even though no ICResp was sent:
> $ nvme connect --transport tcp --traddr 192.168.1.12 --nqn nqn.abc
> Failed to write to /dev/nvme-fabrics: Invalid argument
> could not add new controller: invalid arguments/configuration
> $ dmesg | tail -1
> [1351639.614853] nvme_tcp: queue 0: bad type returned 0
>
> Here's a valid scenario where the controller sends the ICResp Common
> Header and PDU Specific Header separately (using TCP_NODELAY to ensure
> the sends are not coalesced):
> ```
> import socket
>
> def recv_all(socket, length):
>     result = b''
>     while len(result) < length:
>         result += socket.recv(length - len(result))
>     return result
>
> def send_all(socket, data):
>     while data:
>         data = data[socket.send(data):]
>
> listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> listen_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
> listen_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
> listen_sock.bind(('', 4420))
> listen_sock.listen()
> while True:
>     client_sock, _ = listen_sock.accept()
>     recv_all(client_sock, 128) # ICReq
>     common_header = bytes([
>         0x01, # PDU-type
>         0, # FLAGS
>         128, # HLEN
>         0, # PDO
>         128, 0, 0, 0, # PLEN
>     ])
>     send_all(client_sock, common_header)
>     ic_resp = bytes([
>         0, 0, # PFV
>         0, # CPDA
>         0, # DGST
>         0xFC, 0xFF, 0xFF, 0xFF, # MAXH2CDATA
>     ] + [0] * 112)
>     send_all(client_sock, ic_resp)
>     client_sock.close()
> ```
> The host refuses to connect, complaining that the MAXH2CDATA field in
> the ICResp is invalid. But that is because it only received the Common
> Header.
> $ nvme connect --transport tcp --traddr 192.168.1.12 --nqn nqn.abc
> Failed to write to /dev/nvme-fabrics: Invalid argument
> could not add new controller: invalid arguments/configuration
> $ dmesg | tail -1
> [1351960.082011] nvme_tcp: queue 0: invalid maxh2cdata returned 0
>
> With the patch applied, the controller closing the connection without
> sending an ICResp PDU correctly results in a "Connection reset by peer"
> error:
> $ nvme connect --transport tcp --traddr 192.168.1.12 --nqn nqn.abc
> Failed to write to /dev/nvme-fabrics: Connection reset by peer
> could not add new controller: failed to write to nvme-fabrics device
> $ dmesg | tail -1
> [ 450.050463] nvme_tcp: queue 0: failed to receive icresp, error 0
>
> And when the controller sends the Common Header separately from the
> PDU Specific Header, nvme_tcp_init_connection() now succeeds. The
> connection attempt instead times out waiting for a response to the
> Fabrics Connect command, since the fake NVMe/TCP controller doesn't
> implement that:
> $ nvme connect --transport tcp --traddr 192.168.1.12 --nqn nqn.abc
> Failed to write to /dev/nvme-fabrics: Input/output error
> could not add new controller: failed to write to nvme-fabrics device
> $ dmesg | tail -3
> [ 644.728894] nvme nvme0: I/O tag 0 (0000) type 4 opcode 0x7f
> (Connect) QID 0 timeout
> [ 644.728974] nvme nvme0: Connect command failed, error wo/DNR bit: 881
> [ 644.728999] nvme nvme0: failed to connect queue: 0 ret=881
>
Thanks.
I'll see about giving it a spin.
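
If you want to stress the partial-receive path a bit harder, a variation
of your second script that drips the whole ICResp out one byte at a time
might also be interesting (quick sketch, untested):

```
import socket
import time

def recv_all(sock, length):
    # Read exactly `length` bytes, looping over short reads.
    result = b''
    while len(result) < length:
        chunk = sock.recv(length - len(result))
        if not chunk:
            raise ConnectionError('peer closed before full read')
        result += chunk
    return result

listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_sock.bind(('', 4420))
listen_sock.listen()
while True:
    client_sock, _ = listen_sock.accept()
    # Disable Nagle on the accepted socket so the 1-byte sends go out
    # as separate segments instead of being coalesced.
    client_sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    recv_all(client_sock, 128)  # ICReq
    # Same 128-byte ICResp as in your second script, but sent one byte
    # at a time so the host can see a short read at every offset.
    ic_resp_pdu = bytes([
        0x01,                    # PDU-type: ICResp
        0,                       # FLAGS
        128,                     # HLEN
        0,                       # PDO
        128, 0, 0, 0,            # PLEN
        0, 0,                    # PFV
        0,                       # CPDA
        0,                       # DGST
        0xFC, 0xFF, 0xFF, 0xFF,  # MAXH2CDATA
    ] + [0] * 112)
    for i in range(len(ic_resp_pdu)):
        client_sock.sendall(ic_resp_pdu[i:i + 1])
        time.sleep(0.001)  # give the receiver a chance to wake up in between
    client_sock.close()
```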
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@...e.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich