Message-Id: <20240605091731.3111195-1-haowenchao22@gmail.com>
Date: Wed, 5 Jun 2024 17:17:28 +0800
From: Wenchao Hao <haowenchao22@...il.com>
To: "James E . J . Bottomley" <James.Bottomley@...senPartnership.com>,
"Martin K . Petersen" <martin.petersen@...cle.com>,
linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Wenchao Hao <haowenchao22@...il.com>
Subject: [PATCH v5 0/3] SCSI: Fix issues between removing device and error handle
Two issues are triggered because shost_for_each_device() skips devices
that are being removed. Both occur mainly in the error recovery path:
1. The statistics printed at the beginning of scsi_error_handler() are
wrong.
2. Device reset is not triggered. Drivers like smartpqi only implement
eh_device_reset_handler. If device reset is skipped, the commands
that had already been sent to firmware or device hardware are not
cleared. The error handler then flushes all these commands in
scsi_unjam_host(). When the hardware later completes them, a
use-after-free is triggered.
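The first issue can be illustrated with a small userspace model. This is
only a sketch under simplifying assumptions, not kernel code: the toy_*
types and functions are hypothetical stand-ins, and reference counting and
locking are left out. It shows how a walk that skips devices being removed
under-reports the number of failed commands on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's device states. */
enum toy_state { TOY_RUNNING, TOY_CANCEL, TOY_DEL };

struct toy_device {
	enum toy_state state;
	int failed_cmds;	/* commands waiting for error handling */
};

struct toy_host {
	struct toy_device *devs;
	size_t ndevs;
};

/* Nonzero if the device is in the middle of being removed. */
static int toy_is_removing(const struct toy_device *d)
{
	return d->state == TOY_CANCEL || d->state == TOY_DEL;
}

/*
 * Sum the failed commands of all devices on the host. With skip_removing
 * set, the walk mimics an iterator that passes over devices being removed.
 */
static int toy_count_failed(const struct toy_host *h, int skip_removing)
{
	int total = 0;

	assert(h != NULL);
	for (size_t i = 0; i < h->ndevs; i++) {
		if (skip_removing && toy_is_removing(&h->devs[i]))
			continue;
		total += h->devs[i].failed_cmds;
	}
	return total;
}
```

With three devices holding 2, 3 and 1 failed commands, where the last two
are being removed, the skipping walk counts 2 while the full walk counts 6.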
The issue first happened with smartpqi devices and can be reproduced
with scsi_debug. I did not find any description saying that a device in
the SDEV_DEL state cannot perform a device reset, so this should be
addressed.
A new macro, shost_for_each_device_include_deleted(), is added to address
these issues. Unlike shost_for_each_device(), it does not skip a
scsi_device that is being removed when iterating over the host's devices.
It is used when collecting the host's error statistics and when trying to
reset a scsi_device in the error recovery path.
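The difference between the two iteration styles can be sketched in plain
userspace C. This is an illustrative model only and not the code from the
patches: the demo_* names are hypothetical, the device list is a simple
singly linked list, and the reference counting that the real iterators
perform is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device model. */
enum demo_state { DEMO_RUNNING, DEMO_DEL };

struct demo_device {
	enum demo_state state;
	int reset_done;
	struct demo_device *next;
};

/*
 * Return the device following prev (or the first device when prev is
 * NULL). With skip_deleted set, deleted devices are passed over.
 */
static struct demo_device *demo_next(struct demo_device *head,
				     struct demo_device *prev,
				     int skip_deleted)
{
	struct demo_device *d = prev ? prev->next : head;

	while (d && skip_deleted && d->state == DEMO_DEL)
		d = d->next;
	return d;
}

/* Walk that skips deleted devices. */
#define demo_for_each_device(d, head) \
	for ((d) = demo_next((head), NULL, 1); (d); \
	     (d) = demo_next((head), (d), 1))

/* Walk that visits every device, deleted or not. */
#define demo_for_each_device_include_deleted(d, head) \
	for ((d) = demo_next((head), NULL, 0); (d); \
	     (d) = demo_next((head), (d), 0))

/* Mark every visited device as reset and return how many were visited. */
static int demo_reset_all(struct demo_device *head, int include_deleted)
{
	struct demo_device *d;
	int n = 0;

	assert(head != NULL);
	if (include_deleted) {
		demo_for_each_device_include_deleted(d, head) {
			d->reset_done = 1;
			n++;
		}
	} else {
		demo_for_each_device(d, head) {
			d->reset_done = 1;
			n++;
		}
	}
	return n;
}
```

In this model, a list of three devices with the middle one deleted gets two
resets from the skipping walk and three from the including walk, which is
the behavior the error recovery path needs.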
V5:
- Rewrite cover letter and add a Fixes tag to each patch
V4:
- Remove the fourth patch, which fixed an IO hang during device removal,
because the issue is fixed by commit '6df0e077d76bd (scsi: core:
Kick the requeue list after inserting when flushing)'
V3:
- Update patch description
- Update comments of functions added
V2:
- Fix IO hang by running all devices' queues after the error handler
- Do not modify shost_for_each_device() directly; instead add a new
helper that iterates devices without skipping those being removed
Wenchao Hao (3):
scsi: core: Add new helper to iterate all devices of host
scsi: scsi_error: Fix wrong statistic when print error info
scsi: scsi_error: Fix device reset is not triggered
 drivers/scsi/scsi.c        | 46 ++++++++++++++++++++++++++------------
 drivers/scsi/scsi_error.c  |  4 ++--
 include/scsi/scsi_device.h | 25 ++++++++++++++++++---
 3 files changed, 56 insertions(+), 19 deletions(-)
--
2.38.1