Message-ID: <20230928073543.3496394-5-haowenchao2@huawei.com>
Date:   Thu, 28 Sep 2023 15:35:43 +0800
From:   Wenchao Hao <haowenchao2@...wei.com>
To:     "James E . J . Bottomley" <jejb@...ux.ibm.com>,
        "Martin K . Petersen" <martin.petersen@...cle.com>,
        <linux-scsi@...r.kernel.org>
CC:     <linux-kernel@...r.kernel.org>, <louhongxiang@...wei.com>,
        Wenchao Hao <haowenchao2@...wei.com>
Subject: [PATCH v2 4/4] scsi: scsi_core: Fix IO hang when device removing

shost_for_each_device() skips devices that are in the process of being
removed, so scsi_run_host_queues() does not call scsi_run_queue() for
these devices after the host's IO has been blocked.

An IO hang can occur when a device is in the SDEV_CANCEL state, with the
following ordering:

T1:                                         T2: scsi_error_handler
__scsi_remove_device()
  scsi_device_set_state(sdev, SDEV_CANCEL)
  ...
  sd_remove()
  del_gendisk()
  blk_mq_freeze_queue_wait()
                                            scsi_eh_flush_done_q()
                                              scsi_queue_insert(scmd,...)

This happens because scsi_queue_insert() no longer kicks the device's
queue since commit 8b566edbdbfb ("scsi: core: Only kick the requeue list
if necessary").

After scsi_unjam_host(), the SCSI error handler runs the device queues
through scsi_run_host_queues(), which calls scsi_run_queue() for each
device. The queues of devices that are being removed are not run, because
shost_for_each_device() skips them.

As a result, the requests added to these queues are never handled, and
the device removal process hangs as well.

Fix this issue by using shost_for_each_device_include_deleted() in
scsi_run_host_queues(), so that the queues of devices being removed are
run as well.

Signed-off-by: Wenchao Hao <haowenchao2@...wei.com>
---
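Note: the effect described above can be illustrated with a small userspace
model. This is only a sketch, not kernel code. All names prefixed with
toy_ are hypothetical, and the rule that devices in the CANCEL or DEL state
are skipped is an assumption made to mirror the behaviour this patch
attributes to shost_for_each_device().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for SCSI device states. */
enum toy_state { TOY_RUNNING, TOY_CANCEL, TOY_DEL };

struct toy_dev {
	enum toy_state state;
	int requeued;	/* requests waiting for a queue run */
};

/*
 * Assumption: the default iterator skips devices that are being removed,
 * while the "include deleted" variant visits every device.
 */
static int toy_visited(const struct toy_dev *d, int include_deleted)
{
	if (include_deleted)
		return 1;
	return d->state != TOY_CANCEL && d->state != TOY_DEL;
}

/*
 * Model of running all queues on a host: a visited device dispatches
 * its requeued requests. Returns the number of requests left stranded.
 */
int toy_run_host_queues(struct toy_dev *devs, size_t n, int include_deleted)
{
	int stranded = 0;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (toy_visited(&devs[i], include_deleted))
			devs[i].requeued = 0;	/* queue run handles them */
		stranded += devs[i].requeued;
	}
	return stranded;
}
```

In this model, calling toy_run_host_queues() with include_deleted set to 0
on a host that has a TOY_CANCEL device holding requeued requests leaves
those requests stranded, while calling it with include_deleted set to 1
drains them.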
 drivers/scsi/scsi_lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index c2f647a7c1b0..34b408d182e2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -466,7 +466,7 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 {
 	struct scsi_device *sdev;
 
-	shost_for_each_device(sdev, shost)
+	shost_for_each_device_include_deleted(sdev, shost)
 		scsi_run_queue(sdev->request_queue);
 }
 
-- 
2.32.0
