[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 1 Jan 2017 10:30:02 -0500
From: Jason Baron <jbaron@...mai.com>
To: Bart Van Assche <Bart.VanAssche@...disk.com>,
"jejb@...ux.vnet.ibm.com" <jejb@...ux.vnet.ibm.com>,
"hch@...radead.org" <hch@...radead.org>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"sagi@...mberg.me" <sagi@...mberg.me>,
"sathya.prakash@...adcom.com" <sathya.prakash@...adcom.com>,
"suganath-prabu.subramani@...adcom.com"
<suganath-prabu.subramani@...adcom.com>,
"martin.petersen@...cle.com" <martin.petersen@...cle.com>,
"hare@...e.de" <hare@...e.de>,
"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
"hch@....de" <hch@....de>,
"davem@...emloft.net" <davem@...emloft.net>,
"Sreekanth.Reddy@...adcom.com" <Sreekanth.Reddy@...adcom.com>,
"chaitra.basappa@...adcom.com" <chaitra.basappa@...adcom.com>,
"dledford@...hat.com" <dledford@...hat.com>
Subject: Re: [PATCH] scsi: mpt3sas: fix hang on ata passthru commands
On 01/01/2017 09:22 AM, Bart Van Assche wrote:
> On Sat, 2016-12-31 at 15:19 -0800, James Bottomley wrote:
>> On Thu, 2016-12-29 at 00:02 -0800, Christoph Hellwig wrote:
>>> On Wed, Dec 28, 2016 at 11:30:24PM -0500, Jason Baron wrote:
>>>> Add a new parameter to scsi_internal_device_block() to decide
>>>> whether or not to invoke scsi_wait_for_queuecommand().
>>> We'll also need to deal with the blk-mq wait path that Bart has been
>>> working on (I think it's already in the scsi tree, but I'd have to
>>> check).
>>>
>>> Also adding a bool flag for the last call in a function is style
>>> that's a little annoying.
>>>
>>> I'd prefer to add a scsi_internal_device_block_nowait that contains
>>> all the code except for the waiting, and then make
>>> scsi_internal_device_block_nowait a wrapper around it. Or drop the
>>> annoying internal for both while we're at it :)
>> OK, I know it's new year, but this is an unpatched regression in -rc1
>> that's causing serious issues. I would like this fixed by -rc3 so we
>> have 3 options
>>
>> 1. revert all the queuecommand wait stuff until it proves it's actually
>> working without regressions
>> 2. apply this patch and fix the style issues later
>> 3. someone else supplies the correctly styled fix patch
>>
>> The conservative in me says that 1. is probably the most correct thing
>> to do because it gives us time to get the queuecommand wait stuff
>> right; that's what I'll probably do if there's no movement next week.
>> However, since we're reasonably early in the -rc cycle, so either 2 or
>> 3 are viable provided no further regressions caused by the queuecommand
>> wait stuff pop up.
> Hello James,
>
> My recommendation is to revert commit 18f6084a989b ("scsi: mpt3sas: Fix
> secure erase premature termination"). Since the mpt3sas driver uses the
> single-queue approach and since the SCSI core unlocks the block layer
> request queue lock before the .queuecommand callback function is called,
> multiple threads can execute that callback function (scsih_qcmd() in this
> case) simultaneously. This means that using scsi_internal_device_block()
> from inside .queuecommand to serialize SCSI command execution is wrong.
>
> Bart.
Hi,
Yes, commit 18f6084a989b looked racy to me as well. I noted that
in a follow up to my patch, and a suggested a way of fixing it:
http://lkml.iu.edu/hypermail/linux/kernel/1612.3/01301.html
Also worth noting is that commit 18f6084a989b was backported
to at least 4.8 stable.
Thanks,
-Jason
Powered by blists - more mailing lists