Message-Id: <20240605091731.3111195-1-haowenchao22@gmail.com>
Date: Wed,  5 Jun 2024 17:17:28 +0800
From: Wenchao Hao <haowenchao22@...il.com>
To: "James E . J . Bottomley" <James.Bottomley@...senPartnership.com>,
	"Martin K . Petersen" <martin.petersen@...cle.com>,
	linux-scsi@...r.kernel.org,
	linux-kernel@...r.kernel.org
Cc: Wenchao Hao <haowenchao22@...il.com>
Subject: [PATCH v5 0/3] SCSI: Fix issues between removing device and error handle

Two issues arise because shost_for_each_device() skips devices that
are being removed. Both occur mainly in the error recovery path:

1. The statistics printed at the beginning of scsi_error_handler() are
   wrong.
2. Device reset is not triggered. Drivers like smartpqi only implement
   eh_device_reset_handler, so if device reset is skipped, the commands
   that have already been sent to the firmware or device hardware are
   not cleared. The error handler then flushes all these commands in
   scsi_unjam_host(). When the hardware later completes them, a
   use-after-free is triggered.
   The issue first happened with smartpqi devices and can be reproduced
   with scsi_debug. I did not find any description saying that a device
   reset cannot be performed on a device in the SDEV_DEL state, so this
   should be addressed.

A new macro, shost_for_each_device_include_deleted(), is added to
address these issues. Unlike shost_for_each_device(), it does not skip
a scsi_device that is being removed when iterating over the host's
devices. It is used when collecting the host's error statistics and
when trying to reset devices in the error recovery path.
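
To illustrate the difference between the two iterators, here is a
minimal user-space sketch. It is not the kernel code from this series:
every model_* name is hypothetical, the real reference counting and
locking are left out, and "being removed" is reduced to a single
boolean flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_device (illustration only). */
struct model_sdev {
	bool removing;		/* device is in the process of being removed */
	int failed_cmds;	/* commands currently failed on this device */
	bool reset_done;	/* set once a device reset has been attempted */
};

/* Hypothetical stand-in for struct Scsi_Host. */
struct model_host {
	struct model_sdev *devs;
	size_t ndevs;
};

/* Return the device after prev, optionally skipping devices being removed. */
static struct model_sdev *model_next_device(struct model_host *h,
					    struct model_sdev *prev,
					    bool include_removing)
{
	size_t i = prev ? (size_t)(prev - h->devs) + 1 : 0;

	for (; i < h->ndevs; i++) {
		if (include_removing || !h->devs[i].removing)
			return &h->devs[i];
	}
	return NULL;
}

/* Models an iterator that skips devices being removed. */
#define model_for_each_device(d, h) \
	for ((d) = model_next_device((h), NULL, false); (d); \
	     (d) = model_next_device((h), (d), false))

/* Models an iterator that visits every device, including removing ones. */
#define model_for_each_device_include_removing(d, h) \
	for ((d) = model_next_device((h), NULL, true); (d); \
	     (d) = model_next_device((h), (d), true))

/* Sum failed commands over the devices the chosen iterator visits. */
static int model_count_failed(struct model_host *h, bool include_removing)
{
	struct model_sdev *d;
	int total = 0;

	if (include_removing) {
		model_for_each_device_include_removing(d, h)
			total += d->failed_cmds;
	} else {
		model_for_each_device(d, h)
			total += d->failed_cmds;
	}
	return total;
}

/* Mark one device as reset if it has failed commands. */
static void model_try_reset(struct model_sdev *d, int *resets)
{
	if (d->failed_cmds > 0) {
		d->reset_done = true;
		(*resets)++;
	}
}

/* Reset every visited device with failed commands; return the count. */
static int model_reset_failed(struct model_host *h, bool include_removing)
{
	struct model_sdev *d;
	int resets = 0;

	if (include_removing) {
		model_for_each_device_include_removing(d, h)
			model_try_reset(d, &resets);
	} else {
		model_for_each_device(d, h)
			model_try_reset(d, &resets);
	}
	return resets;
}
```

In this model, a host whose only failed commands sit on a removing
device reports zero failures and resets nothing with the skipping
iterator, which mirrors issues 1 and 2 above. With the including
iterator, the same host reports the failures and the device is reset.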

V5:
 - Rewrite the cover letter and add a Fixes tag to each patch

V4:
 - Remove the fourth patch, which fixed an IO hang during device
   removal, because the issue is fixed by commit 6df0e077d76bd ("scsi:
   core: Kick the requeue list after inserting when flushing")

V3:
  - Update patch description
  - Update comments of functions added

V2:
  - Fix IO hang by running all devices' queues after the error handler
  - Do not modify shost_for_each_device() directly; instead add a new
    helper that iterates over devices without skipping those being
    removed

Wenchao Hao (3):
  scsi: core: Add new helper to iterate all devices of host
  scsi: scsi_error: Fix wrong statistic when print error info
  scsi: scsi_error: Fix device reset is not triggered

 drivers/scsi/scsi.c        | 46 ++++++++++++++++++++++++++------------
 drivers/scsi/scsi_error.c  |  4 ++--
 include/scsi/scsi_device.h | 25 ++++++++++++++++++---
 3 files changed, 56 insertions(+), 19 deletions(-)

-- 
2.38.1

