Message-ID: <20251126021250.2583630-1-mkhalfella@purestorage.com>
Date: Tue, 25 Nov 2025 18:11:47 -0800
From: Mohamed Khalfella <mkhalfella@...estorage.com>
To: Chaitanya Kulkarni <kch@...dia.com>,
	Christoph Hellwig <hch@....de>,
	Jens Axboe <axboe@...nel.dk>,
	Keith Busch <kbusch@...nel.org>,
	Sagi Grimberg <sagi@...mberg.me>
Cc: Aaron Dailey <adailey@...estorage.com>,
	Randy Jennings <randyj@...estorage.com>,
	John Meneghini <jmeneghi@...hat.com>,
	Hannes Reinecke <hare@...e.de>,
	linux-nvme@...ts.infradead.org,
	linux-kernel@...r.kernel.org,
	Mohamed Khalfella <mkhalfella@...estorage.com>
Subject: [RFC PATCH 00/14] TP8028 Rapid Path Failure Recovery

This patchset adds support for TP8028 Rapid Path Failure Recovery to
both the nvme target and the initiator. Rapid Path Failure Recovery
brings Cross-Controller Reset (CCR) functionality to nvme. CCR allows an
nvme host to send a command to a source nvme controller asking it to
reset an impacted nvme controller, provided that both the source and
impacted controllers are in the same nvme subsystem.

The main use of CCR is when one path to an nvme subsystem fails. Inflight
IOs on the impacted nvme controller need to be terminated before they
can be retried on another path; otherwise data corruption may happen.
CCR provides a quick way to terminate these IOs on the unreachable nvme
controller, allowing recovery to proceed quickly and avoiding
unnecessary delays. If CCR is not possible, inflight requests are held
for the duration defined by TP4129 KATO Corrections and Clarifications
before they are allowed to be retried.
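
The choice between the two recovery paths can be sketched roughly as
follows. This is an illustration only, not code from the series: the
enum, struct, and function names are hypothetical, and the TP4129 hold
duration is passed in as a parameter (fallback_hold_ms) rather than
computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not identifiers from the patches. */
enum recovery_action {
	RECOVERY_RETRY_AFTER_CCR,  /* CCR terminated IOs on the impacted controller */
	RECOVERY_HOLD_THEN_RETRY,  /* CCR not possible: fall back to time-based hold */
};

struct recovery_plan {
	enum recovery_action action;
	uint32_t hold_ms;          /* time inflight IOs stay held before retry */
};

/*
 * Decide how inflight IOs may be retried on another path.
 * fallback_hold_ms stands in for the TP4129-defined duration, which this
 * sketch takes as an input instead of deriving it.
 */
static void plan_recovery(bool ccr_succeeded, uint32_t fallback_hold_ms,
			  struct recovery_plan *plan)
{
	assert(plan != NULL);

	if (ccr_succeeded) {
		plan->action = RECOVERY_RETRY_AFTER_CCR;
		plan->hold_ms = 0;
	} else {
		plan->action = RECOVERY_HOLD_THEN_RETRY;
		plan->hold_ms = fallback_hold_ms;
	}
}
```

In both branches the requests are retried only after the impacted
controller can no longer complete them; the difference is whether that
is confirmed by CCR or assumed after the hold time expires.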


On the target side:

- New struct members have been added to support CCR. struct nvme_id_ctrl
  has been updated with CIU (Controller Instance Uniquifier), CIRN
  (Controller Instance Random Number), and CQT (Command Quiesce Time).
  The combination of CIU, CNTLID, and CIRN is used to identify impacted
  controller in CCR command.

- The CCR nvme command is implemented on the target. It causes the
  impacted controller to fail and drop its connections to the host.

- CCR logpage contains the status of pending CCR requests. An entry is
  added to the logpage after CCR request is validated. Completed CCR
  requests are removed from the logpage when controller becomes ready or
  when requested in get logpage command.

- An AEN is sent when CCR completes to let the host know that it is safe
  to retry inflight requests.
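
To illustrate how the (CIU, CNTLID, CIRN) tuple can select the impacted
controller, here is a minimal lookup sketch. It is not the nvmet
implementation: struct ctrl_instance and find_impacted_ctrl() are
hypothetical, and the field widths are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; field widths are assumptions made for this sketch. */
struct ctrl_instance {
	uint16_t cntlid; /* controller ID */
	uint8_t  ciu;    /* Controller Instance Uniquifier */
	uint64_t cirn;   /* Controller Instance Random Number */
};

/*
 * Return the controller whose CNTLID, CIU, and CIRN all match the values
 * carried in the request, or NULL if no controller matches all three.
 */
static const struct ctrl_instance *
find_impacted_ctrl(const struct ctrl_instance *ctrls, size_t nr,
		   const struct ctrl_instance *want)
{
	size_t i;

	assert(want != NULL);
	assert(ctrls != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (ctrls[i].cntlid == want->cntlid &&
		    ctrls[i].ciu == want->ciu &&
		    ctrls[i].cirn == want->cirn)
			return &ctrls[i];
	}
	return NULL;
}
```

Requiring all three fields to match means a request that carries the
right CNTLID but a different CIU or CIRN selects no controller in this
sketch.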


On the host side:

- CIU, CIRN, and CQT have been added to struct nvme_ctrl. CIU and CIRN
  have been added to sysfs to make the values visible to user. CIU and
  CIRN can be used to construct and manually send admin-passthru CCR
  commands.

- New controller state NVME_CTRL_RECOVERING has been added to prevent
  cancelling timed out inflight requests while CCR is in progress.
  Controller flag NVME_CTRL_RECOVERED was also added to signal end of
  time-based recovery.

- Controller recovery in nvme_recover_ctrl() is invoked when a LIVE
  controller hits an error or when a request times out. Recovery first
  attempts CCR to reset the impacted controller.

- Updated nvme fabric transports nvme-tcp, nvme-rdma, and nvme-fc to use
  CCR recovery.
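
The effect of the RECOVERING state on timed out requests can be
pictured with a small state sketch. This is a simplified model, not the
host driver code: the enum values and on_request_timeout() are
hypothetical stand-ins, and the real state machine has more states and
transitions than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of the controller states involved. */
enum ctrl_state { CTRL_LIVE, CTRL_RECOVERING, CTRL_RESETTING };

enum timeout_action {
	TIMEOUT_START_RECOVERY, /* first failure on a LIVE controller */
	TIMEOUT_KEEP_HELD,      /* recovery in progress: do not cancel */
	TIMEOUT_CANCEL,         /* any other state: cancel the request */
};

/* Decide what to do with a request that timed out. */
static enum timeout_action on_request_timeout(enum ctrl_state *state)
{
	assert(state != NULL);

	switch (*state) {
	case CTRL_LIVE:
		/* Enter RECOVERING so later timeouts are held, not cancelled. */
		*state = CTRL_RECOVERING;
		return TIMEOUT_START_RECOVERY;
	case CTRL_RECOVERING:
		return TIMEOUT_KEEP_HELD;
	default:
		return TIMEOUT_CANCEL;
	}
}
```

The point of the model is that only the first timeout on a LIVE
controller starts recovery; later timeouts leave requests held until
recovery finishes.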


Ideally all inflight requests should be held during controller recovery
and only retried after recovery is done. However, there are known
situations where this is not the case in this implementation. These gaps
will be addressed in future patches:

- Manual controller reset from sysfs results in the controller going to
  RESETTING state and all inflight requests being canceled immediately
  and possibly retried on another path.

- Manual controller delete from sysfs also results in all inflight
  requests being canceled immediately and possibly retried on another
  path.

- In nvme-fc the nvme controller is deleted if the remote port
  disappears with no timeout specified. This results in immediate
  cancellation of requests, which may be retried on another path.

- In nvme-rdma if the HCA is removed all nvme controllers are deleted.
  This results in inflight IOs being canceled and possibly retried on
  another path.

- In nvme-fc if the controller is LIVE and an IO ends with an error from
  the LLDD, only this IO is completed immediately. The rest of the
  inflight IOs are held correctly because the controller will have
  transitioned to RECOVERING state.



Mohamed Khalfella (14):
  nvmet: Rapid Path Failure Recovery set controller identify fields
  nvmet/debugfs: Add ctrl uniquifier and random values
  nvmet: Implement CCR nvme command
  nvmet: Implement CCR logpage
  nvmet: Send an AEN on CCR completion
  nvme: Rapid Path Failure Recovery read controller identify fields
  nvme: Add RECOVERING nvme controller state
  nvme: Implement cross-controller reset recovery
  nvme: Implement cross-controller reset completion
  nvme-tcp: Use CCR to recover controller that hits an error
  nvme-rdma: Use CCR to recover controller that hits an error
  nvme-fc: Decouple error recovery from controller reset
  nvme-fc: Use CCR to recover controller that hits an error
  nvme-fc: Hold inflight requests while in RECOVERING state

 drivers/nvme/host/constants.c   |   1 +
 drivers/nvme/host/core.c        | 197 +++++++++++++++++++++++++++++++-
 drivers/nvme/host/fc.c          | 194 ++++++++++++++++++++-----------
 drivers/nvme/host/nvme.h        |  24 ++++
 drivers/nvme/host/rdma.c        |  51 +++++++--
 drivers/nvme/host/sysfs.c       |  24 ++++
 drivers/nvme/host/tcp.c         |  52 +++++++--
 drivers/nvme/target/admin-cmd.c | 127 ++++++++++++++++++++
 drivers/nvme/target/core.c      | 103 ++++++++++++++++-
 drivers/nvme/target/debugfs.c   |  21 ++++
 drivers/nvme/target/nvmet.h     |  18 ++-
 include/linux/nvme.h            |  57 ++++++++-
 12 files changed, 778 insertions(+), 91 deletions(-)


base-commit: fd95357fd8c6778ac7dea6c57a19b8b182b6e91f
-- 
2.51.2

