Message-Id: <20250109-nvme-fc-handle-com-lost-v4-0-fe5cae17b492@kernel.org>
Date: Thu, 09 Jan 2025 14:30:46 +0100
From: Daniel Wagner <wagi@...nel.org>
To: James Smart <james.smart@...adcom.com>, Keith Busch <kbusch@...nel.org>, 
 Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>, 
 Hannes Reinecke <hare@...e.de>, Paul Ely <paul.ely@...adcom.com>
Cc: linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org, 
 Daniel Wagner <wagi@...nel.org>
Subject: [PATCH v4 0/3] nvme-fc: fix race with connectivity loss and
 nvme_fc_create_association

As requested by Sagi, I've fixed the race window in nvme-fc by using the
already existing ASSOC_FAILED flag for tracking the connectivity loss.

Daniel

previous cover letter:

We got a bug report that a controller was stuck in the connected state
after an association dropped.

It turns out that nvme_fc_create_association can succeed even though some
operations fail. This is on purpose, to handle the degraded controller
case, where the admin queue is up and running but the I/O queues are not.
In this case the controller will still reach the LIVE state.

Unfortunately, this also ignores a full connectivity loss for fabric
controllers. Let's address this by not filtering out all errors in
nvme_set_queue_count.
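
To illustrate the idea, here is a minimal user-space sketch of the error
filtering described above. It is not the kernel code: the function name
sketch_set_queue_count, its parameters, and the SKETCH_SC_HOST_PATH_ERROR
constant are hypothetical stand-ins, and the sketch assumes that a negative
status is a kernel errno, a positive status is a controller-reported NVMe
status, and that one specific positive status signals connectivity loss.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the status that signals a lost connection. */
#define SKETCH_SC_HOST_PATH_ERROR 0x370

/*
 * Model of handling the result of the queue-count negotiation.
 *   status < 0  : kernel error (errno)
 *   status > 0  : status reported for the command
 *   status == 0 : success, 'granted' I/O queues were allocated
 * On entry *nr_io_queues holds the requested count; on a successful
 * return it holds the usable count (0 means degraded, admin queue only).
 */
static int sketch_set_queue_count(int status, int granted, int *nr_io_queues)
{
	assert(nr_io_queues != NULL);

	/*
	 * Kernel error or connectivity loss: the controller cannot be
	 * reached, so propagate the error instead of filtering it out.
	 */
	if (status < 0 || status == SKETCH_SC_HOST_PATH_ERROR)
		return status;

	if (status > 0) {
		/* Other controller errors: keep going in degraded mode. */
		*nr_io_queues = 0;
		return 0;
	}

	/* Success: never use more queues than the controller granted. */
	if (granted < *nr_io_queues)
		*nr_io_queues = granted;
	return 0;
}
```

With this split, a caller can still bring up a degraded controller when
the return value is 0 and the queue count is 0, while a non-zero return
value tells it to take the error path rather than move on to LIVE.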

---
Changes in v4:
- collected review tags
- dropped "nvme_ctrl_reset to keep alive end io handler" again
- added "nvme-fc: do not ignore connectivity loss during connecting"
- Link to v3: https://lore.kernel.org/r/20241129-nvme-fc-handle-com-lost-v3-0-d8967b3cae54@kernel.org

Changes in v3:
- collected reviewed tags
- added "nvme_ctrl_reset to keep alive end io handler"
- Link to v2: https://lore.kernel.org/r/20241029-nvme-fc-handle-com-lost-v2-0-5b0d137e2a0a@kernel.org

Changes in v2:
- handle connection lost in nvme_set_queue_count directly
- collected reviewed tags
- Link to v1: https://lore.kernel.org/r/20240611190647.11856-1-dwagner@suse.de

---
Daniel Wagner (3):
      nvme-fc: go straight to connecting state when initializing
      nvme: handle connectivity loss in nvme_set_queue_count
      nvme-fc: do not ignore connectivity loss during connecting

 drivers/nvme/host/core.c |  8 +++++++-
 drivers/nvme/host/fc.c   | 26 +++++++++++++++++++-------
 2 files changed, 26 insertions(+), 8 deletions(-)
---
base-commit: b9973aa4d0507c4969ad87763b535edb77b7dceb
change-id: 20241029-nvme-fc-handle-com-lost-9b241936809a

Best regards,
-- 
Daniel Wagner <wagi@...nel.org>

