lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <698d4b22-719c-4e57-94ed-f507e425ee12@suse.de>
Date: Thu, 13 Jun 2024 08:27:38 +0200
From: Hannes Reinecke <hare@...e.de>
To: Wenchao Hao <haowenchao22@...il.com>,
 "James E . J . Bottomley" <James.Bottomley@...senPartnership.com>,
 "Martin K . Petersen" <martin.petersen@...cle.com>,
 linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 1/3] scsi: core: Add new helper to iterate all devices
 of host

On 6/12/24 17:06, Wenchao Hao wrote:
> On 6/12/24 4:33 PM, Hannes Reinecke wrote:
>> On 6/5/24 11:17, Wenchao Hao wrote:
>>> shost_for_each_device() would skip devices which is in SDEV_CANCEL or
>>> SDEV_DEL state, for some scenarios, we donot want to skip these devices,
>>> so add a new macro shost_for_each_device_include_deleted() to handle it.
>>>
>>> Following changes are introduced:
>>>
>>> 1. Rework scsi_device_get(), add new helper __scsi_device_get() which
>>>      determine if skip deleted scsi_device by parameter "skip_deleted".
>>> 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which
>>>      is used when calling __scsi_device_get()
>>> 3. Update shost_for_each_device() to call __scsi_iterate_devices() with
>>>      "skip_deleted" true
>>> 4. Add new macro shost_for_each_device_include_deleted() which call
>>>      __scsi_iterate_devices() with "skip_deleted" false
>>>
>>> Signed-off-by: Wenchao Hao <haowenchao22@...il.com>
>>> ---
>>>    drivers/scsi/scsi.c        | 46 ++++++++++++++++++++++++++------------
>>>    include/scsi/scsi_device.h | 25 ++++++++++++++++++---
>>>    2 files changed, 54 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
>>> index 3e0c0381277a..5913de543d93 100644
>>> --- a/drivers/scsi/scsi.c
>>> +++ b/drivers/scsi/scsi.c
>>> @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable)
>>>        return 0;
>>>    }
>>>    -/**
>>> - * scsi_device_get  -  get an additional reference to a scsi_device
>>> +/*
>>> + * __scsi_device_get  -  get an additional reference to a scsi_device
>>>     * @sdev:    device to get a reference to
>>> - *
>>> - * Description: Gets a reference to the scsi_device and increments the use count
>>> - * of the underlying LLDD module.  You must hold host_lock of the
>>> - * parent Scsi_Host or already have a reference when calling this.
>>> - *
>>> - * This will fail if a device is deleted or cancelled, or when the LLD module
>>> - * is in the process of being unloaded.
>>> + * @skip_deleted: when true, would return failed if device is deleted
>>>     */
>>> -int scsi_device_get(struct scsi_device *sdev)
>>> +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted)
>>>    {
>>> -    if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
>>> +    /*
>>> +     * if skip_deleted is true and device is in removing, return failed
>>> +     */
>>> +    if (skip_deleted &&
>>> +        (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL))
>>>            goto fail;
>>
>> Nack.
>> SDEV_DEL means the device is about to be deleted, so we _must not_ access it at all.
>>
> 
> Sorry I added SDEV_DEL here at hand without understanding what it means.
> Actually, just include scsi_device which is in SDEV_CANCEL would fix the
> issues I described.
> 
> The issues are because device removing concurrent with error handle.
> Normally, error handle would not be triggered when scsi_device is in
> SDEV_DEL. Below is my analysis, if it is wrong, please correct me.
> 
> If there are scsi_cmnd remain unfinished when removing scsi_device,
> the removing process would waiting for all commands to be finished.
> If commands error happened and trigger error handle, the removing
> process would be blocked until error handle finished, because
> __scsi_remove_device called  del_gendisk() which would wait all
> requests to be finished. So now scsi_device is in SDEV_CANCEL.
> 
> If the scsi_device is already in SDEV_DEL, then no scsi_cmnd has been
> dispatched to this scsi_device, then error handle would never triggered.
> 
> I want to change the new function __scsi_device_get() as following,
> please help to review.
> 
> /*
>   * __scsi_device_get  -  get an additional reference to a scsi_device
>   * @sdev:	device to get a reference to
>   * @skip_canceled: when true, would return failed if device is deleted
>   */
> static int __scsi_device_get(struct scsi_device *sdev, bool skip_canceled)
> {
> 	/*
> 	 * if skip_canceled is true and device is in removing, return failed
> 	 */
> 	if (sdev->sdev_state == SDEV_DEL ||
> 	    (sdev->sdev_state == SDEV_CANCEL && skip_canceled))
> 		goto fail;
> 	if (!try_module_get(sdev->host->hostt->module))
> 		goto fail;
> 	if (!get_device(&sdev->sdev_gendev))
> 		goto fail_put_module;
> 	return 0;
> 
> fail_put_module:
> 	module_put(sdev->host->hostt->module);
> fail:
> 	return -ENXIO;
> }
> 
I don't think that's required.
With your above analysis, wouldn't the problem be solved with:

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 775df00021e4..911fcfa80d69 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1470,6 +1470,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
         if (sdev->sdev_state == SDEV_DEL)
                 return;

+       scsi_block_when_processing_errors(sdev);
+
         if (sdev->is_visible) {
                 /*
                  * If scsi_internal_target_block() is running concurrently,

Hmm?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@...e.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ