linux-kernel - Re: [PATCH v2 08/14] nvme: Implement cross-controller reset recovery

Message-ID: <5f3c9cf0-7fee-432a-b6c5-44fb2acb0b1d@gmail.com>
Date: Tue, 10 Feb 2026 14:49:15 -0800
From: James Smart <jsmart833426@...il.com>
To: Mohamed Khalfella <mkhalfella@...estorage.com>
Cc: Justin Tee <justin.tee@...adcom.com>,
 Naresh Gottumukkala <nareshgottumukkala83@...il.com>,
 Paul Ely <paul.ely@...adcom.com>, Chaitanya Kulkarni <kch@...dia.com>,
 Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>,
 Keith Busch <kbusch@...nel.org>, Sagi Grimberg <sagi@...mberg.me>,
 Aaron Dailey <adailey@...estorage.com>,
 Randy Jennings <randyj@...estorage.com>,
 Dhaval Giani <dgiani@...estorage.com>, Hannes Reinecke <hare@...e.de>,
 linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 08/14] nvme: Implement cross-controller reset recovery

On 2/10/2026 2:27 PM, Mohamed Khalfella wrote:
> On Tue 2026-02-10 14:09:27 -0800, James Smart wrote:
>> On 1/30/2026 2:34 PM, Mohamed Khalfella wrote:
>> ...
>>> +unsigned long nvme_fence_ctrl(struct nvme_ctrl *ictrl)
>>> +{
>>> +	unsigned long deadline, now, timeout;
>>> +	struct nvme_ctrl *sctrl;
>>> +	u32 min_cntlid = 0;
>>> +	int ret;
>>> +
>>> +	timeout = nvme_fence_timeout_ms(ictrl);
>>> +	dev_info(ictrl->device, "attempting CCR, timeout %lums\n", timeout);
>>> +
>>> +	now = jiffies;
>>> +	deadline = now + msecs_to_jiffies(timeout);
>>> +	while (time_before(now, deadline)) {
>>
>> Q: don't we have something to identify that the controller's subsystem
>> supports CCR before we start selecting controllers and sending CCR?
>>
>> I would think on older devices that don't support it we should be
>> skipping this loop. The loop could hold up the Time-Based delay without
>> any CCR ever being done.
> 
> I do not think we have something that identifies CCR support at
> subsystem level. The spec defines CCRL at the controller level. The loop
> should not be that bad. nvme_find_ctrl_ccr() should return NULL if CCR is
> not supported and nvme_fence_ctrl() will return immediately.
> 
>>
>> -- james
>>

I would think CCRL on the failed controller would be enough to assume 
the subsystem supports it.

I'm not worried that the coding on the host is so bad. It's more the
multiple paths that must have cmds sent to them and that get error
responses for unknown cmds (they should be responded to ok, but you never
know), as well as creating conditions for other errors where there will
be no return for it, e.g. other paths losing connectivity while the CCR
is outstanding, etc. Yes, they all have to work, but why bother adding
these flows to an old controller that would never do CCR?
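
Roughly what such a gate could look like, as a user-space sketch only and
not the patch's code. struct mock_ctrl, its fields, and both function names
are hypothetical, and treating a CCRL of 0 as "CCR not supported" is an
assumption made for illustration:

```c
/*
 * Illustrative sketch: skip the CCR loop entirely when the failed
 * controller never advertised CCR. All names here are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_ctrl {
	unsigned int cntlid;
	unsigned int ccrl;	/* assumption: 0 means CCR is not supported */
	bool live;		/* path is currently usable */
};

/* Treat CCRL on the failed controller as a hint for the whole subsystem. */
static bool subsys_may_support_ccr(const struct mock_ctrl *ictrl)
{
	return ictrl->ccrl != 0;
}

/*
 * Count the paths a CCR would be sent on for the failed controller ictrl.
 * Returns 0 without looking at any other path when ictrl has no CCRL, so
 * the caller can go straight to the time-based delay.
 */
static int fence_ctrl_sketch(const struct mock_ctrl *ictrl,
			     const struct mock_ctrl *paths, size_t npaths)
{
	int sent = 0;
	size_t i;

	assert(ictrl != NULL);

	if (!subsys_may_support_ccr(ictrl))
		return 0;

	for (i = 0; i < npaths; i++) {
		/* Skip the failed controller itself and unusable paths. */
		if (paths[i].cntlid == ictrl->cntlid || !paths[i].live)
			continue;
		/* Skip paths that do not advertise CCR themselves. */
		if (paths[i].ccrl == 0)
			continue;
		sent++;
	}
	return sent;
}
```

With a check like this, an old controller with no CCRL generates no extra
cmds on the other paths at all.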

-- james