linux-kernel - [RFC PATCH v1 0/7] nvme-tcp: Implement TP4129 (KATO Corrections and Clarifications)


Message-ID: <20250324174909.3919131-1-mkhalfella@purestorage.com>
Date: Mon, 24 Mar 2025 10:48:53 -0700
From: Mohamed Khalfella <mkhalfella@...estorage.com>
To: Christoph Hellwig <hch@....de>,
	Sagi Grimberg <sagi@...mberg.me>,
	Keith Busch <kbusch@...nel.org>
Cc: Hannes Reinecke <hare@...e.de>,
	Daniel Wagner <wagi@...nel.org>,
	John Meneghini <jmeneghi@...hat.com>,
	randyj@...estorage.com,
	adailey@...estorage.com,
	jrani@...estorage.com,
	linux-nvme@...ts.infradead.org,
	linux-kernel@...r.kernel.org,
	mkhalfella@...estorage.com
Subject: [RFC PATCH v1 0/7] nvme-tcp: Implement TP4129 (KATO Corrections and Clarifications)

Hello,

This RFC patchset implements TP4129 (KATO Corrections and Clarifications)
for nvme-tcp. nvme-tcp was chosen as an example to demonstrate the
approach taken in the patchset. The other fabric transports, nvme-rdma
and nvme-fc, will be added later, incorporating the feedback received on
this RFC.

TP4129 requires the nvme controller to not immediately cancel inflight
requests when the connection between initiator and target is lost.
Instead, inflight requests need to be held for a duration that is long
enough to allow the target to learn about the connection loss and
quiesce pending commands. Only then can pending requests on the
initiator side be canceled and possibly retried safely on another path.
The main issue TP4129 tries to address is ABA corruption that could
happen if inflight requests are retried immediately on another path.

The request hold timeout has two components:

- KATO timeout is the time sufficient for the target to learn about the
  connection loss. It depends on whether command based or traffic based
  keepalive is used. As per TP4129 the timeout is supposed to be 3 * KATO
  for traffic based keepalive and 2 * KATO for command based keepalive.

- CQT is the time needed by the target controller to quiesce inflight
  nvme commands after the controller learns about the connection loss.
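
As a rough illustration of how the two components could combine, here is
a minimal standalone sketch. It assumes the hold time is simply the sum
of the KATO component and CQT, with all values in milliseconds. The
function name example_req_hold_timeout_ms() and its parameters are
hypothetical and are not the nvmef_req_hold_timeout_ms() helper added by
this series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only; not the code in this series.
 * Assumption: hold time = KATO component + CQT, all in milliseconds.
 *
 * kato_ms: keep alive timeout negotiated for the controller
 * cqt_ms:  command quiesce time reported by the controller
 * traffic_based_keepalive: true for traffic based keepalive,
 *                          false for command based keepalive
 */
static inline uint64_t example_req_hold_timeout_ms(uint32_t kato_ms,
						   uint32_t cqt_ms,
						   bool traffic_based_keepalive)
{
	/* 3 * KATO for traffic based, 2 * KATO for command based keepalive */
	uint64_t kato_factor = traffic_based_keepalive ? 3 : 2;

	/* 64-bit arithmetic avoids overflow for large KATO values */
	return kato_factor * kato_ms + cqt_ms;
}
```

For example, under these assumptions a KATO of 5000 ms and a CQT of
1000 ms would give 16000 ms with traffic based keepalive and 11000 ms
with command based keepalive.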

On controller reset or delete, inflight requests are canceled if the
controller was disabled correctly. Otherwise, the requests are held
until it is safe to release them.

Jyoti Rani (1):
  nvme-core: Read CQT wait from identify controller response

Mohamed Khalfella (6):
  nvmef: Add nvmef_req_hold_timeout_ms() to calculate kato request hold
    time
  nvme-tcp: Move freeing tagset out of nvme_tcp_teardown_io_queues()
  nvme-tcp: Move freeing admin tagset out of
    nvme_tcp_teardown_admin_queue()
  nvme-tcp: Split nvme_tcp_teardown_io_queues() into two functions
  nvme-core: Add support for holding inflight requests
  nvme-tcp: Do not immediately cancel inflight requests during recovery

 drivers/nvme/host/core.c    | 62 +++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/fabrics.h |  7 +++++
 drivers/nvme/host/nvme.h    |  5 +++
 drivers/nvme/host/tcp.c     | 56 +++++++++++++++++++++------------
 include/linux/nvme.h        |  4 ++-
 5 files changed, 114 insertions(+), 20 deletions(-)

-- 
2.48.1