Date: Wed, 29 Mar 2017 14:26:31 +0200
From: Johannes Thumshirn <jthumshirn@...e.de>
To: John Garry <john.garry@...wei.com>
Cc: "Martin K . Petersen" <martin.petersen@...cle.com>,
	Tejun Heo <tj@...nel.org>,
	James Bottomley <jejb@...ux.vnet.ibm.com>,
	Dan Williams <dan.j.williams@...el.com>,
	Jack Wang <jinpu.wang@...fitbricks.com>,
	Hannes Reinecke <hare@...e.de>,
	Linux SCSI Mailinglist <linux-scsi@...r.kernel.org>,
	Linux Kernel Mailinglist <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister

On Wed, Mar 29, 2017 at 12:53:28PM +0100, John Garry wrote:
> On 29/03/2017 12:29, Johannes Thumshirn wrote:
> >On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
> >>On 29/03/2017 10:41, Johannes Thumshirn wrote:
> >>>In the event of a SAS device unregister we have to wait for all destruct
> >>>works to be done to not accidentally delay deletion of a SAS rphy or its
> >>>children to the point when we're removing the SCSI or SAS hosts.
> >>>
> >>>Signed-off-by: Johannes Thumshirn <jthumshirn@...e.de>
> >>>---
> >>>drivers/scsi/libsas/sas_discover.c | 4 ++++
> >>>1 file changed, 4 insertions(+)
> >>>
> >>>diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
> >>>index 60de662..75b18f1 100644
> >>>--- a/drivers/scsi/libsas/sas_discover.c
> >>>+++ b/drivers/scsi/libsas/sas_discover.c
> >>>@@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
> >>> 	}
> >>>
> >>> 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
> >>>+		struct sas_discovery *disc = &dev->port->disc;
> >>>+		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
> >>>+
> >>> 		sas_rphy_unlink(dev->rphy);
> >>> 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
> >>> 		sas_discover_event(dev->port, DISCE_DESTRUCT);
> >>>+		flush_work(&sw->work);
> >>
> >>I quickly tested plugging out the expander and we never get past this call
> >>to flush - a hang results:
> >
> >Can you activate lockdep so we can see which lock it is that we're blocking on?
> >
>
> I have it on:
> CONFIG_LOCKDEP_SUPPORT=y
> CONFIG_LOCKD=y
> CONFIG_LOCKD_V4=y
>
> >It's most likely in sas_unregister_common_dev() but this function takes two spin
> >locks, port->dev_list_lock and ha->lock.
> >
>
> We can see from the callstack I provided that we're working in workqueue
> scsi_wq_0 and trying to flush that same queue.

Aaahh, now I get what's happening (with some kicks^Whelp from Hannes I admit).
The sas_unregister_dev() comes from the work queued by notify_phy_event().
So this patch must be replaced by (untested):

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index cdbb293..e1e6492 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -375,6 +375,7 @@ void sas_remove_children(struct device *dev)
  */
 void sas_remove_host(struct Scsi_Host *shost)
 {
+	scsi_flush_work(shost);
 	sas_remove_children(&shost->shost_gendev);
 }
 EXPORT_SYMBOL(sas_remove_host);

John, mind giving that one a shot in your test setup as well?

Thanks,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@...e.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
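
For readers following the thread, the hang discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, and every name in it (model_wq, model_work, model_flush_work, model_drain, phy_event_fn, destruct_fn, model_scenario) is hypothetical. It assumes a workqueue served by a single worker that runs items strictly in order, which is how the thread describes scsi_wq_0 being used. Where a real flush_work() would sleep forever, the model returns -EDEADLK so the condition can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_WORK 8

struct model_wq;

/* Hypothetical stand-in for a work item. */
struct model_work {
	void (*fn)(struct model_wq *wq);
	bool pending;			/* queued but not yet started */
};

/* Hypothetical stand-in for a queue served by exactly one worker. */
struct model_wq {
	struct model_work *items[MODEL_MAX_WORK];
	size_t head, tail;
	bool worker_busy;		/* the single worker is inside an item */
};

struct model_result {
	int flush_in_worker;		/* flush issued from inside a work item */
	int flush_outside;		/* flush issued from outside the queue */
	int destruct_runs;
};

static struct model_work destruct_work;
static int destruct_runs;
static int flush_in_worker;

static void model_queue_work(struct model_wq *wq, struct model_work *w)
{
	if (w->pending)
		return;
	assert(wq->tail < MODEL_MAX_WORK);
	w->pending = true;
	wq->items[wq->tail++] = w;
}

/* Run every queued item in order; models flushing the whole queue
 * from a context that is not itself running on the queue. */
static void model_drain(struct model_wq *wq)
{
	while (wq->head < wq->tail) {
		struct model_work *w = wq->items[wq->head++];

		w->pending = false;
		wq->worker_busy = true;
		w->fn(wq);
		wq->worker_busy = false;
	}
}

/* Wait for one item. If the caller is the queue's only worker, the item
 * can never start, so report -EDEADLK where the real thing would hang. */
static int model_flush_work(struct model_wq *wq, struct model_work *w)
{
	if (!w->pending)
		return 0;
	if (wq->worker_busy)
		return -EDEADLK;
	model_drain(wq);
	return 0;
}

static void destruct_fn(struct model_wq *wq)
{
	(void)wq;
	destruct_runs++;
}

/* Models the first patch: queue the destruct work, then flush it while
 * still running on the same single-worker queue. */
static void phy_event_fn(struct model_wq *wq)
{
	model_queue_work(wq, &destruct_work);
	flush_in_worker = model_flush_work(wq, &destruct_work);
}

static struct model_result model_scenario(void)
{
	struct model_wq wq = { .head = 0, .tail = 0, .worker_busy = false };
	struct model_work phy_event = { .fn = phy_event_fn, .pending = false };
	struct model_result r;

	destruct_work.fn = destruct_fn;
	destruct_work.pending = false;
	destruct_runs = 0;

	model_queue_work(&wq, &phy_event);
	model_drain(&wq);	/* flush from outside, as in the second patch */

	r.flush_in_worker = flush_in_worker;
	r.flush_outside = model_flush_work(&wq, &destruct_work);
	r.destruct_runs = destruct_runs;
	return r;
}
```

Calling model_scenario() reports -EDEADLK for the flush issued from inside the work item, while the drain issued from outside the queue lets the destruct item run once. That is the reasoning behind moving the flush out of sas_unregister_dev() and into sas_remove_host(), which is not expected to run on that workqueue.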