Message-ID: <CAPpK+O152NEqCrzzLEsUiDiO=CS6OYLfeZ4RN-KGVSH2XTXMOA@mail.gmail.com>
Date: Tue, 6 Jan 2026 19:16:36 -0800
From: Randy Jennings <randyj@...estorage.com>
To: Sagi Grimberg <sagi@...mberg.me>
Cc: Mohamed Khalfella <mkhalfella@...estorage.com>, Chaitanya Kulkarni <kch@...dia.com>,
Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>, Keith Busch <kbusch@...nel.org>,
Aaron Dailey <adailey@...estorage.com>, John Meneghini <jmeneghi@...hat.com>,
Hannes Reinecke <hare@...e.de>, linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 08/14] nvme: Implement cross-controller reset recovery
On Sun, Jan 4, 2026 at 1:14 PM Sagi Grimberg <sagi@...mberg.me> wrote:
> On 31/12/2025 2:04, Randy Jennings wrote:
> >>> +
> >>> +        if (!ret) {
> >>> +                dev_info(ictrl->device, "CCR succeeded using %s\n",
> >>> +                         dev_name(sctrl->device));
> >>> +                blk_put_queue(sctrl->admin_q);
> >>> +                nvme_put_ctrl(sctrl);
> >>> +                return 0;
> >>> +        }
> >>> +
> >>> +        /* Try another controller */
> >>> +        min_cntlid = sctrl->cntlid + 1;
> >> OK, I see why min_cntlid is used. That is very non-intuitive.
> >>
> >> I'm wondering if it will be simpler to take one shot at ccr and,
> >> if it fails, fall back to crt. I mean, if the sctrl is alive and it was
> >> unable to reset the ictrl in time, how would another ctrl do a better
> >> job here?
> > There are many different kinds of failures we are dealing with here
> > that result in a dropped connection (association). It could be a problem
> > with the specific link, or it could be that the node of an HA pair in the
> > storage array went down. In the case of a specific link problem, maybe
> > only one of the connections is down and any controller would work.
> > In the case of the node of an HA pair, roughly half of the connections
> > are going down, and there is a race between the controllers which
> > are detected down first. There were some heuristics put into the
> > spec about deciding which controller to use, but that is more code
> > and a refinement that could come later (and they are still heuristics;
> > they may not be helpful).
> >
> > Because CCR offers a significant win of shortening the recovery time
> > substantially, it is worth retrying on the other controllers. This time
> > affects when we can start retrying IO. KATO is in seconds, and
> > NVMEoF should have the capability of doing a significant amount of
> > IOs in each of those seconds.
>
> But it doesn't actually do I/O; it issues I/O and then waits for it to
> time out.
Retrying CCR does not itself do I/O (trying to pin down what your "it"
refers to), but a successful CCR lets the host get back to doing I/O,
and every second saved can represent a significant amount of I/O. Given
a choice between a 1 second failover and a 60 second failover, of course
you would take the 1 second failover. But given a choice between a 10
second failover and a 60 second failover, I would still take the 10
second failover; the 50 seconds saved are still extremely valuable.
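
To put a rough number on it (purely illustrative; the IOPS figure is an
assumption, not a measurement): at 100,000 IOPS, shaving 50 seconds off
a failover is on the order of 100,000 * 50 = 5,000,000 I/Os that can
complete instead of sitting queued behind the recovery.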
>
> >
> > Besides, the alternative is just to wait. Might as well be actively trying
> > to shorten that wait time. Besides a small increase in code complexity,
> > is there a downside to doing so?
>
> Simplicity is very important when it comes to non-trivial code paths
> like error recovery.
Okay, yes: unwarranted complexity, even with some benefit, might not be
worth it. I can see that my comment could be taken as flippant. But the
extra complexity here buys an important and material benefit.
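
To make the min_cntlid retry concrete, here is a rough sketch of the
loop I have in mind (written for this mail, not copied from the patch;
nvme_find_ccr_ctrl() and nvme_try_ccr() are placeholder names for "find
the next live controller with cntlid >= min_cntlid" and "attempt CCR
through it"):

static int nvme_ccr_retry_all(struct nvme_ctrl *ictrl)
{
        struct nvme_ctrl *sctrl;
        u16 min_cntlid = 0;
        int ret = -ENODEV;

        /* Walk live sibling controllers in cntlid order. */
        while ((sctrl = nvme_find_ccr_ctrl(ictrl, min_cntlid))) {
                ret = nvme_try_ccr(ictrl, sctrl);
                if (!ret) {
                        nvme_put_ctrl(sctrl);
                        return 0;       /* CCR succeeded, I/O can resume */
                }
                /* Try another controller */
                min_cntlid = sctrl->cntlid + 1;
                nvme_put_ctrl(sctrl);
        }
        return ret;
}

The only state carried across attempts is min_cntlid, so the added
complexity over a one-shot attempt is small.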
Sincerely,
Randy Jennings