[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4E4A53F0.9040104@ce.jp.nec.com>
Date: Tue, 16 Aug 2011 20:26:40 +0900
From: "Jun'ichi Nomura" <j-nomura@...jp.nec.com>
To: Alan Stern <stern@...land.harvard.edu>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Tejun Heo <tj@...nel.org>
CC: jaxboe@...ionio.com, roland@...estorage.com,
linux-scsi@...r.kernel.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
device-mapper development <dm-devel@...hat.com>,
Kiyoshi Ueda <k-ueda@...jp.nec.com>
Subject: Re: [BUG] Oops when SCSI device under multipath is removed
Hi,
On 08/12/11 00:16, Alan Stern wrote:
> On Thu, 11 Aug 2011, James Bottomley wrote:
>> However, much as I'd like to accept this rosy view, the original oops
>> that started all of this in 2.6.38 was someone caught something with a
>> reference to a SCSI queue after the device release function had been
>> called.
>
> Not according to your commit log. You wrote that the reference was
> taken after scsi_remove_device() had been called -- but the device
> release function is scsi_device_dev_release_usercontext().
The commit log of 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b
("[SCSI] put stricter guards on queue dead checks") does not
explain about the move of scsi_free_queue().
But according to the discussion below, it seems
the move was motivated to solve the following self-deadlock:
https://lkml.org/lkml/2011/4/12/9
[in the context of kblockd_workqueue]
blk_delay_work
__blk_run_queue
scsi_request_fn
put_device
(puts final sdev refcount)
scsi_device_dev_release
execute_in_process_context(scsi_device_dev_release_usercontext)
[execute immediately because it's in process context]
scsi_device_dev_release_usercontext
scsi_free_queue
blk_cleanup_queue
blk_sync_queue
(wait for blk_delay_work to complete...)
James, is my understanding correct?
If so, isn't it possible to move the scsi_free_queue back to
the original place and solve the deadlock instead by
avoiding the wait in the same context?
@@ -338,8 +339,8 @@ static void scsi_device_dev_release_user
static void scsi_device_dev_release(struct device *dev)
{
struct scsi_device *sdp = to_scsi_device(dev);
- execute_in_process_context(scsi_device_dev_release_usercontext,
- &sdp->ew);
+ INIT_WORK(&sdp->ew.work, scsi_device_dev_release_usercontext);
+ schedule_work(&sdp->ew.work);
}
static struct class sdev_class = {
Thanks,
--
Jun'ichi Nomura, NEC Corporation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists