Message-ID: <a9d09c59-5e5c-c271-345e-b4349968bb0b@suse.de>
Date: Wed, 24 Aug 2016 16:21:38 +0200
From: Hannes Reinecke <hare@...e.de>
To: John Garry <john.garry@...wei.com>, jejb@...ux.vnet.ibm.com,
martin.petersen@...cle.com
Cc: linuxarm@...wei.com, zhangfei.gao@...aro.org, xuwei5@...ilicon.com,
john.garry2@...l.dcu.ie, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 5/8] hisi_sas: add v2 hw slot complete internal abort support

On 08/24/2016 04:07 PM, John Garry wrote:
> On 24/08/2016 13:59, Hannes Reinecke wrote:
>> On 08/24/2016 01:05 PM, John Garry wrote:
>>> Add code in slot_complete_v2_hw() to deal with the
>>> slots which have completed due to internal abort.
>>>
>>> The status codes have the following meaning:
>>> - STAT_IO_ABORTED: the IO has been aborted due to
>>> internal abort, whether by device or individual
>>> abort command
>>> - STAT_IO_COMPLETE: internal abort command has
>>> completed successfully for device or individual
>>> abort command
>>> - STAT_IO_NO_DEVICE: internal abort command has
>>> completed for device but cannot find any IO
>>> - STAT_IO_NOT_VALID: internal abort command has
>>> completed for single command but could not
>>> find the command
>>>
>>> Signed-off-by: John Garry <john.garry@...wei.com>
>>> ---
>>> drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 31 +++++++++++++++++++++++++++++++
>>> 1 file changed, 31 insertions(+)
>>>
>>> diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
>>> index fec1675..bf9b693 100644
>>> --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
>>> +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
>>> @@ -227,6 +227,13 @@
>>> #define CMPLT_HDR_RSPNS_XFRD_MSK (0x1 << CMPLT_HDR_RSPNS_XFRD_OFF)
>>> #define CMPLT_HDR_ERX_OFF 12
>>> #define CMPLT_HDR_ERX_MSK (0x1 << CMPLT_HDR_ERX_OFF)
>>> +#define CMPLT_HDR_ABORT_STAT_OFF 13
>>> +#define CMPLT_HDR_ABORT_STAT_MSK (0x7 << CMPLT_HDR_ABORT_STAT_OFF)
>>> +/* abort_stat */
>>> +#define STAT_IO_NOT_VALID 0x1
>>> +#define STAT_IO_NO_DEVICE 0x2
>>> +#define STAT_IO_COMPLETE 0x3
>>> +#define STAT_IO_ABORTED 0x4
>>> /* dw1 */
>>> #define CMPLT_HDR_IPTT_OFF 0
>>> #define CMPLT_HDR_IPTT_MSK (0xffff << CMPLT_HDR_IPTT_OFF)
>>> @@ -1569,6 +1576,30 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot,
>>> goto out;
>>> }
>>>
>>> + /* Use SAS+TMF status codes */
>>> + switch ((complete_hdr->dw0 & CMPLT_HDR_ABORT_STAT_MSK)
>>> + >> CMPLT_HDR_ABORT_STAT_OFF) {
>>> + case STAT_IO_ABORTED:
>>> + /* this io has been aborted by abort command */
>>> + ts->stat = SAS_ABORTED_TASK;
>>> + goto out;
>>> + case STAT_IO_COMPLETE:
>>> + /* internal abort command complete */
>>> + ts->stat = TMF_RESP_FUNC_COMPLETE;
>>> + goto out;
>>> + case STAT_IO_NO_DEVICE:
>>> + ts->stat = TMF_RESP_FUNC_COMPLETE;
>>> + goto out;
>>> + case STAT_IO_NOT_VALID:
>>> + /* abort single io: the controller could not find
>>> + * the io that needs to be aborted
>>> + */
>>> + ts->stat = TMF_RESP_FUNC_FAILED;
>>> + goto out;
>> Hmm. This will cause the SCSI EH to kick in.
>> And yet, according to the description, the abort has succeeded; it's just
>> that for some reason the associated command couldn't be found.
>> So couldn't this be due to a race condition, where the command has in fact
>> been aborted correctly (and the code is just too slow in acknowledging it)?
>>
>
> Hi Hannes,
>
> I'm not sure I fully get your question.
>
> The internal abort would happen from the SCSI error handling. An example
> would be when the disk was not safely removed and some IO is still in
> flight. In this case the IO will time out, SCSI EH starts, and we try to
> abort the command in the LLDD, by TMF (which would fail) and by internal abort.
>
> For internal abort, if the abort command succeeds then 2 things happen:
> - abort task completes with status STAT_IO_COMPLETE
> - task which was aborted completes with status STAT_IO_ABORTED
>
> If the command does not abort successfully then:
> - abort task completes with status STAT_IO_NOT_VALID
> - task which we wanted to be aborted does not complete and is probably
> still in the slave device
>
> I hope that this makes it clear.
>
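For reference, the status handling discussed above can be sketched as a standalone C fragment. This is only an illustration and not the driver code: the CMPLT_HDR_* and STAT_IO_* values are copied from the quoted patch, while the demo_stat enum and both function names are hypothetical stand-ins for the libsas status codes. The sketch also assumes dw0 is already in CPU byte order.

```c
#include <stdint.h>

/* Field layout and abort_stat values, copied from the quoted patch. */
#define CMPLT_HDR_ABORT_STAT_OFF	13
#define CMPLT_HDR_ABORT_STAT_MSK	(0x7 << CMPLT_HDR_ABORT_STAT_OFF)
#define STAT_IO_NOT_VALID		0x1
#define STAT_IO_NO_DEVICE		0x2
#define STAT_IO_COMPLETE		0x3
#define STAT_IO_ABORTED			0x4

/*
 * Hypothetical stand-ins for the status codes used in the patch.
 * The numeric values here are NOT the kernel's values.
 */
enum demo_stat {
	DEMO_NOT_ABORT_RELATED,		/* field is 0: fall through to normal handling */
	DEMO_SAS_ABORTED_TASK,
	DEMO_TMF_RESP_FUNC_COMPLETE,
	DEMO_TMF_RESP_FUNC_FAILED,
};

/* Extract the 3-bit abort_stat field (bits 13..15) from dw0. */
static uint32_t abort_stat_field(uint32_t dw0)
{
	return (dw0 & CMPLT_HDR_ABORT_STAT_MSK) >> CMPLT_HDR_ABORT_STAT_OFF;
}

/* Map abort_stat to a status, mirroring the switch in the patch. */
static enum demo_stat classify_abort_stat(uint32_t dw0)
{
	switch (abort_stat_field(dw0)) {
	case STAT_IO_ABORTED:
		/* this IO was the target of an internal abort */
		return DEMO_SAS_ABORTED_TASK;
	case STAT_IO_COMPLETE:
		/* the internal abort command itself completed */
		return DEMO_TMF_RESP_FUNC_COMPLETE;
	case STAT_IO_NO_DEVICE:
		/* device-wide abort completed, no IO was found */
		return DEMO_TMF_RESP_FUNC_COMPLETE;
	case STAT_IO_NOT_VALID:
		/* single-command abort could not find the command */
		return DEMO_TMF_RESP_FUNC_FAILED;
	default:
		return DEMO_NOT_ABORT_RELATED;
	}
}
```

With this mapping, a successful single-command abort shows up as two completions: one slot classified as DEMO_TMF_RESP_FUNC_COMPLETE (the abort command) and one as DEMO_SAS_ABORTED_TASK (the aborted IO). A failed abort yields only DEMO_TMF_RESP_FUNC_FAILED for the abort command.
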
Right, that answers it.

Reviewed-by: Hannes Reinecke <hare@...e.com>

Cheers,

Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@...e.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)