[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <775f75db-edb8-8f39-2592-862756811710@oracle.com>
Date: Sun, 3 Apr 2022 11:58:59 -0500
From: Mike Christie <michael.christie@...cle.com>
To: Wenchao Hao <haowenchao@...wei.com>,
"James E . J . Bottomley" <jejb@...ux.ibm.com>,
"Martin K . Petersen" <martin.petersen@...cle.com>,
Hannes Reinecke <hare@...e.de>, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: linfeilong@...wei.com
Subject: Re: [PATCH] scsi: core: always finish successfully aborted notry
command
On 3/31/22 10:50 PM, Wenchao Hao wrote:
> If the abort command succeed and it does not need to be retired, do not add
> it to error handle list.
>
> Adding command to error handle list is an annoying flow which would stop I/O
> of all LUNs which shared a same HBA.
>
> So here if we successfully abort a command, we can finish it via
> scsi_finish_command() which would reduce time spent on scsi_error_handler()
>
> Signed-off-by: Wenchao Hao <haowenchao@...wei.com>
> ---
> drivers/scsi/scsi_error.c | 55 +++++++++++++++++++++------------------
> 1 file changed, 29 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index cdaca13ac1f1..15299603b7ee 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -173,41 +173,44 @@ scmd_eh_abort_handler(struct work_struct *work)
> goto out;
> }
> set_host_byte(scmd, DID_TIME_OUT);
> - if (scsi_host_eh_past_deadline(shost)) {
> - SCSI_LOG_ERROR_RECOVERY(3,
> - scmd_printk(KERN_INFO, scmd,
> - "eh timeout, not retrying "
> - "aborted command\n"));
> - goto out;
> - }
>
> - spin_lock_irqsave(shost->host_lock, flags);
> - list_del_init(&scmd->eh_entry);
> + if (scsi_noretry_cmd(scmd) ||
> + !scsi_cmd_retry_allowed(scmd) &&
> + !scsi_eh_should_retry_cmd(scmd)) {
I don't think this test is correct. Did you want all ||s?
For what the patch is trying to accomplish I'm not sure if it's
the behavior people wanted. Do all drivers have configurable
abort timeouts? If an abort takes N minutes to complete ok,
and that's put us over the eh deadline maybe the user wanted
that device to be offlined.
Powered by blists - more mailing lists