lists.openwall.net - Open Source and information security mailing list archives
Date: Wed, 29 Mar 2017 12:15:44 +0100
From: John Garry <john.garry@...wei.com>
To: Johannes Thumshirn <jthumshirn@...e.de>, "Martin K. Petersen" <martin.petersen@...cle.com>
CC: Tejun Heo <tj@...nel.org>, James Bottomley <jejb@...ux.vnet.ibm.com>,
	Dan Williams <dan.j.williams@...el.com>, Jack Wang <jinpu.wang@...fitbricks.com>,
	Hannes Reinecke <hare@...e.de>, Linux SCSI Mailinglist <linux-scsi@...r.kernel.org>,
	Linux Kernel Mailinglist <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister

On 29/03/2017 10:41, Johannes Thumshirn wrote:
> In the event of a SAS device unregister we have to wait for all destruct
> works to be done so as not to accidentally delay deletion of a SAS rphy or
> its children to the point when we're removing the SCSI or SAS hosts.
>
> Signed-off-by: Johannes Thumshirn <jthumshirn@...e.de>
> ---
>  drivers/scsi/libsas/sas_discover.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> index 60de662..75b18f1 100644
> --- a/drivers/scsi/libsas/sas_discover.c
> +++ b/drivers/scsi/libsas/sas_discover.c
> @@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>  	}
>
>  	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> +		struct sas_discovery *disc = &dev->port->disc;
> +		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> +
>  		sas_rphy_unlink(dev->rphy);
>  		list_move_tail(&dev->disco_list_node, &port->destroy_list);
>  		sas_discover_event(dev->port, DISCE_DESTRUCT);
> +		flush_work(&sw->work);

I quickly tested plugging out the expander and we never get past this
call to flush - a hang results:

root@(none)$ [ 243.357088] INFO: task kworker/u32:1:106 blocked for more than 120 seconds.
[ 243.364030]       Not tainted 4.11.0-rc1-13687-g2562e6a-dirty #1388
[ 243.370282] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.378086] kworker/u32:1   D    0   106      2 0x00000000
[ 243.383566] Workqueue: scsi_wq_0 sas_phye_loss_of_signal
[ 243.388863] Call trace:
[ 243.391314] [<ffff000008085d70>] __switch_to+0xa4/0xb0
[ 243.396442] [<ffff0000088f1134>] __schedule+0x1b4/0x5d0
[ 243.401654] [<ffff0000088f1588>] schedule+0x38/0x9c
[ 243.406520] [<ffff0000088f4540>] schedule_timeout+0x194/0x294
[ 243.412249] [<ffff0000088f202c>] wait_for_common+0xb0/0x144
[ 243.417805] [<ffff0000088f20d4>] wait_for_completion+0x14/0x1c
[ 243.423623] [<ffff0000080d5bd4>] flush_work+0xe0/0x1a8
[ 243.428747] [<ffff000008598158>] sas_unregister_dev+0xf8/0x110
[ 243.434563] [<ffff000008598304>] sas_unregister_domain_devices+0x4c/0xc8
[ 243.441242] [<ffff000008596884>] sas_deform_port+0x14c/0x15c
[ 243.446886] [<ffff000008596508>] sas_phye_loss_of_signal+0x48/0x54
[ 243.453048] [<ffff0000080d6164>] process_one_work+0x138/0x2d8
[ 243.458776] [<ffff0000080d635c>] worker_thread+0x58/0x424
[ 243.464161] [<ffff0000080dc16c>] kthread+0xf4/0x120
[ 243.469024] [<ffff0000080836c0>] ret_from_fork+0x10/0x50
[ 364.189094] INFO: task kworker/u32:1:106 blocked for more than 120 seconds.
[ 364.196035]       Not tainted 4.11.0-rc1-13687-g2562e6a-dirty #1388
[ 364.202281] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 364.210085] kworker/u32:1   D    0   106      2 0x00000000
[ 364.215558] Workqueue: scsi_wq_0 sas_phye_loss_of_signal
[ 364.220855] Call trace:
[ 364.223303] [<ffff000008085d70>] __switch_to+0xa4/0xb0
[ 364.228428] [<ffff0000088f1134>] __schedule+0x1b4/0x5d0
[ 364.233640] [<ffff0000088f1588>] schedule+0x38/0x9c
[ 364.238506] [<ffff0000088f4540>] schedule_timeout+0x194/0x294
[ 364.244237] [<ffff0000088f202c>] wait_for_common+0xb0/0x144
[ 364.249793] [<ffff0000088f20d4>] wait_for_completion+0x14/0x1c
[ 364.255610] [<ffff0000080d5bd4>] flush_work+0xe0/0x1a8
[ 364.260736] [<ffff000008598158>] sas_unregister_dev+0xf8/0x110
[ 364.266551] [<ffff000008598304>] sas_unregister_domain_devices+0x4c/0xc8
[ 364.273230] [<ffff000008596884>] sas_deform_port+0x14c/0x15c
[ 364.278872] [<ffff000008596508>] sas_phye_loss_of_signal+0x48/0x54
[ 364.285034] [<ffff0000080d6164>] process_one_work+0x138/0x2d8
[ 364.290763] [<ffff0000080d635c>] worker_thread+0x58/0x424
[ 364.296147] [<ffff0000080dc16c>] kthread+0xf4/0x120
[ 364.301013] [<ffff0000080836c0>] ret_from_fork+0x10/0x50

Is the issue that we are trying to flush the queue when we are working
in the same queue context?

Thanks,
John

> }
> }
>
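[Editor's note: the self-flush scenario John asks about, a handler on a workqueue calling flush_work() on a work item queued to that same workqueue, can be sketched in simplified kernel-style C. All identifiers below are hypothetical illustrations, not actual libsas code:]

```c
/*
 * Hypothetical sketch of a workqueue self-flush deadlock, assuming an
 * ordered (single-concurrency) workqueue: handler_a() occupies the
 * queue's only execution slot, queues work_b on the same queue, then
 * flush_work() sleeps waiting for work_b - which cannot start until
 * handler_a() returns. The trace above shows the same shape:
 * sas_phye_loss_of_signal (running on scsi_wq_0) ends up in flush_work()
 * waiting on a work item serviced by the same context.
 */
#include <linux/workqueue.h>

static struct workqueue_struct *wq;	/* from alloc_ordered_workqueue() */
static struct work_struct work_b;

static void handler_b(struct work_struct *w)
{
	/* never runs while handler_a() still occupies the queue */
}

static void handler_a(struct work_struct *w)
{
	queue_work(wq, &work_b);
	flush_work(&work_b);	/* sleeps forever: circular wait */
}
```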