Message-ID: <343ddf8b-70e0-32f8-6ab8-31479729f827@huawei.com>
Date:   Wed, 29 Mar 2017 12:53:28 +0100
From:   John Garry <john.garry@...wei.com>
To:     Johannes Thumshirn <jthumshirn@...e.de>
CC:     "Martin K . Petersen" <martin.petersen@...cle.com>,
        Tejun Heo <tj@...nel.org>,
        James Bottomley <jejb@...ux.vnet.ibm.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Jack Wang <jinpu.wang@...fitbricks.com>,
        "Hannes Reinecke" <hare@...e.de>,
        Linux SCSI Mailinglist <linux-scsi@...r.kernel.org>,
        Linux Kernel Mailinglist <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] scsi: sas: flush destruct workqueue on device unregister

On 29/03/2017 12:29, Johannes Thumshirn wrote:
> On Wed, Mar 29, 2017 at 12:15:44PM +0100, John Garry wrote:
>> On 29/03/2017 10:41, Johannes Thumshirn wrote:
>>> In the event of a SAS device unregister we have to wait for all destruct
>>> works to be done to not accidentally delay deletion of a SAS rphy or its
>>> children to the point when we're removing the SCSI or SAS hosts.
>>>
>>> Signed-off-by: Johannes Thumshirn <jthumshirn@...e.de>
>>> ---
>>> drivers/scsi/libsas/sas_discover.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
>>> index 60de662..75b18f1 100644
>>> --- a/drivers/scsi/libsas/sas_discover.c
>>> +++ b/drivers/scsi/libsas/sas_discover.c
>>> @@ -382,9 +382,13 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
>>> 	}
>>>
>>> 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
>>> +		struct sas_discovery *disc = &dev->port->disc;
>>> +		struct sas_work *sw = &disc->disc_work[DISCE_DESTRUCT].work;
>>> +
>>> 		sas_rphy_unlink(dev->rphy);
>>> 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
>>> 		sas_discover_event(dev->port, DISCE_DESTRUCT);
>>> +		flush_work(&sw->work);
>>
>> I quickly tested plugging out the expander and we never get past this call
>> to flush - a hang results:
>
> Can you activate lockdep so we can see which lock it is that we're blocking on?
>

I have it on:
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y

> It's most likely in sas_unregister_common_dev() but this function takes two spin
> locks, port->dev_list_lock and ha->lock.
>

We can see from the callstack I provided that we're working in workqueue 
scsi_wq_0 and trying to flush that same queue.
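
To illustrate why that hangs, here is a minimal userspace sketch. It is a toy model, not libsas or kernel workqueue code: every toy_* name is hypothetical, and it assumes the queue is served by a single worker thread. The idea is that a wait for a work item can never finish when the waiter is the only worker that could run that item.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WORK 8

/* Toy model (hypothetical) of a workqueue served by exactly one worker. */
struct toy_wq {
	int pending[TOY_MAX_WORK];	/* FIFO of queued work ids */
	size_t head, tail;		/* pending items live in [head, tail) */
	int running;			/* id being executed, or -1 if idle */
};

static void toy_wq_init(struct toy_wq *wq)
{
	wq->head = 0;
	wq->tail = 0;
	wq->running = -1;
}

/* Append a work id to the queue; returns false if the toy queue is full. */
static bool toy_queue_work(struct toy_wq *wq, int id)
{
	if (wq->tail == TOY_MAX_WORK)
		return false;
	wq->pending[wq->tail++] = id;
	return true;
}

/* The single worker picks the oldest pending item; returns its id or -1. */
static int toy_start_next(struct toy_wq *wq)
{
	if (wq->head == wq->tail)
		return -1;
	wq->running = wq->pending[wq->head++];
	return wq->running;
}

/* True if work `id` is queued but has not been started yet. */
static bool toy_is_pending(const struct toy_wq *wq, int id)
{
	for (size_t i = wq->head; i < wq->tail; i++)
		if (wq->pending[i] == id)
			return true;
	return false;
}

/*
 * Would waiting for work `id` to finish hang? In this model it does when
 * the waiter is the queue's only worker and the target is either still
 * pending or is the item the worker is executing right now: the worker
 * cannot make progress on it while it is blocked in the wait.
 */
static bool toy_flush_would_hang(const struct toy_wq *wq, bool from_worker,
				 int id)
{
	if (!from_worker || wq->running < 0)
		return false;
	return wq->running == id || toy_is_pending(wq, id);
}
```

In this model, if work 1 (standing in for the hot-unplug handling) is running on the worker and queues work 2 (standing in for the destruct work) on the same queue, toy_flush_would_hang(&wq, true, 2) is true, while the same wait issued from any other context is not flagged.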

Much appreciated,
John

> Thanks a lot,
>        Johannes
>

