[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ce9cef41-29e2-4056-a60b-b0e4ee1cc17e@acm.org>
Date: Fri, 22 Sep 2023 08:18:26 -0700
From: Bart Van Assche <bvanassche@....org>
To: Wenchao Hao <haowenchao2@...wei.com>,
"James E . J . Bottomley" <jejb@...ux.ibm.com>,
"Martin K . Petersen" <martin.petersen@...cle.com>,
open-iscsi@...glegroups.com, linux-scsi@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, louhongxiang@...wei.com
Subject: Re: [PATCH 2/2] scsi: scsi_error: Fix device reset is not triggered
On 9/22/23 02:36, Wenchao Hao wrote:
> Fix the issue of skipping scsi_try_bus_device_reset() for devices
> which is in progress of removing in following order:
>
> T1: T2:scsi_error_handle
> __scsi_remove_device
> scsi_device_set_state(sdev, SDEV_DEL)
> // would skip device with SDEV_DEL state
> shost_for_each_device()
> scsi_try_bus_device_reset
> flush all commands
> ...
> scsi_device is released
>
> Some drivers like smartpqi only implement eh_device_reset_handler,
> if device reset is skipped, the commands which had been sent to
> firmware or devices hardware are not cleared. The error handle
> would flush all these commands in scsi_unjam_host().
>
> When the commands are finished by hardware, use after free issue is
> triggered.
>
> Add parameter "check_state" to macro shost_for_each_device() to
> determine if check device status when traversal scsi_device
> of Scsi_Host, and set this parameter to false when traversal
> in scsi_error_handle to address this issue.
The above is incomprehensible to me. Please explain more clearly why
this change is needed.
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index d0911bc28663..db8b9e42267c 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -704,6 +704,23 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable)
> return 0;
> }
>
> +static int __scsi_device_get(struct scsi_device *sdev, bool check_state)
"check_state" is a bad argument name because it does not clearly explain
the purpose of this argument. Would "include_deleted" perhaps be a
better name?
> +{
> + if (check_state &&
> + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL))
> + goto fail;
> + if (!try_module_get(sdev->host->hostt->module))
> + goto fail;
> + if (!get_device(&sdev->sdev_gendev))
> + goto fail_put_module;
> + return 0;
> +
> +fail_put_module:
> + module_put(sdev->host->hostt->module);
> +fail:
> + return -ENXIO;
> +}
Looking at the above code, I think we need two functions: one that does
not include the sdev->sdev_state check and a second function that
includes the sdev->sdev_state check (scsi_device_get()) and calls the
first. That will result in code that is easier to read than calls to a
function with a boolean argument.
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index c498a12f7715..e166d053c839 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -389,21 +389,25 @@ extern void __starget_for_each_device(struct scsi_target *, void *,
>
> /* only exposed to implement shost_for_each_device */
> extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *,
> - struct scsi_device *);
> + struct scsi_device *,
> + bool);
>
> /**
> * shost_for_each_device - iterate over all devices of a host
> * @sdev: the &struct scsi_device to use as a cursor
> * @shost: the &struct scsi_host to iterate over
> + * @check_state: if skip check scsi_device's state to skip some devices
> + * scsi_device with SDEV_DEL or SDEV_CANCEL would be skipped
> + * if this is true
> *
> * Iterator that returns each device attached to @shost. This loop
> * takes a reference on each device and releases it at the end. If
> * you break out of the loop, you must call scsi_device_put(sdev).
> */
> -#define shost_for_each_device(sdev, shost) \
> - for ((sdev) = __scsi_iterate_devices((shost), NULL); \
> +#define shost_for_each_device(sdev, shost, check_state) \
> + for ((sdev) = __scsi_iterate_devices((shost), NULL, check_state); \
> (sdev); \
> - (sdev) = __scsi_iterate_devices((shost), (sdev)))
> + (sdev) = __scsi_iterate_devices((shost), (sdev), check_state))
>
> /**
> * __shost_for_each_device - iterate over all devices of a host (UNLOCKED)
Since only the SCSI error handler passes 0 as 'check_state' argument to
shost_for_each_device(), instead of adding a boolean argument to that
macro, please do the following:
* Introduce a new macro for the check_state = 1 case.
* Keep the semantics for shost_for_each_device().
With this approach no SCSI LLDs will have to be modified.
Thanks,
Bart.
Powered by blists - more mailing lists