lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d8fa9332-9475-433e-a203-e376845dcc66@roeck-us.net>
Date: Wed, 30 Apr 2025 09:12:30 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Keith Busch <kbusch@...nel.org>, Daniel Wagner <dwagner@...e.de>
Cc: Hannes Reinecke <hare@...e.de>, Daniel Wagner <wagi@...nel.org>,
 Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
 Sagi Grimberg <sagi@...mberg.me>, James Smart <james.smart@...adcom.com>,
 Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
 linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] nvme: only allow entering LIVE from CONNECTING state

On 4/30/25 09:01, Keith Busch wrote:
> On Wed, Apr 30, 2025 at 08:43:04AM +0200, Daniel Wagner wrote:
>> On Tue, Apr 29, 2025 at 11:42:25AM -0700, Keith Busch wrote:
>>> On Tue, Apr 29, 2025 at 11:23:25AM -0700, Guenter Roeck wrote:
>>>> On 4/29/25 11:13, Keith Busch wrote:
>>>>> On Mon, Apr 28, 2025 at 03:21:18PM +0200, Hannes Reinecke wrote:
>>>>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>>>>> index b502ac07483b..d3c4eacf607f 100644
>>>>>>> --- a/drivers/nvme/host/core.c
>>>>>>> +++ b/drivers/nvme/host/core.c
>>>>>>> @@ -4493,7 +4493,8 @@ static void nvme_fw_act_work(struct work_struct *work)
>>>>>>>                    msleep(100);
>>>>>>>            }
>>>>>>>
>>>>>>> -       if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>>> +       if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING) ||
>>>>>>> +           !nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE))
>>>>>>>                    return;
>>>>>>>
>>>>>>>            nvme_unquiesce_io_queues(ctrl);
>>>>>>
>>>>>> I would rather have a separate state for firmware activation.
>>>>>> (Ab-)using the 'RESETTING' state here has direct implications
>>>>>> with the error handler, as for the error handler 'RESETTING'
>>>>>> means that the error handler has been scheduled.
>>>>>> Which is not true for firmware activation.
>>>>>
>>>>> But the point of having firmware activation set the state to RESETTING
>>>>> was to fence off error handling from trying to schedule a real reset.
>>>>> The fw activation work schedules its own recovery if it times out, but
>>>>> we don't want any other recovery action or user requested resets to
>>>>> proceed while an activation is still pending.
>>>>
>>>> Not only that; there are various checks against NVME_CTRL_RESETTING
>>>> sprinkled through the code. What is the impact of introducing a new state
>>>> without handling all those checks ?
>>>
>>> Good point, bad things will happen if these checks are not updated to
>>> know about the new state. For example, nvme-pci will attempt aborting IO
>>> or disabling the controller on a timeout instead of restarting the timer
>>> as desired.
>>>
>>> Can we just revert the commit that prevented the RESETTING -> LIVE
>>> transtion for now?
>>
>> Unfortunately, it will break the FC error handling again(*). The
>> simplest fix is right above, add the transition from RESETTING to
>> CONNECTING and then to LIVE, IMO.
> 
> Gotcha, yes, that looks like the simplest fix for the current release
> then. We need to be careful with adding new states, so we can revisit
> Hannes' suggestion for 6.16 if we really want to split this.
> 
> If you send the simple fix as a formal patch, please add my review and
> the "Fixes:" tag.
> 
> Reviewed-by: Keith Busch <kbusch@...nel.org>

Please feel free to add mine as well:

Reviewed-by: Guenter Roeck <linux@...ck-us.net>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ