linux-kernel - Re: [PATCH] scsi: core: always finish successfully aborted notry command

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <275178ca-1fe4-b00b-c1f4-c8c34e7f4686@huawei.com>
Date:   Wed, 6 Apr 2022 14:56:32 +0800
From:   Wenchao Hao <haowenchao@...wei.com>
To:     Mike Christie <michael.christie@...cle.com>,
        "James E . J . Bottomley" <jejb@...ux.ibm.com>,
        "Martin K . Petersen" <martin.petersen@...cle.com>,
        Hannes Reinecke <hare@...e.de>, <linux-scsi@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
CC:     <linfeilong@...wei.com>
Subject: Re: [PATCH] scsi: core: always finish successfully aborted notry
 command

On 2022/4/4 0:58, Mike Christie wrote:
> On 3/31/22 10:50 PM, Wenchao Hao wrote:
>> If the abort command succeed and it does not need to be retired, do not add
>> it to error handle list. 
>>
>> Adding command to error handle list is an annoying flow which would stop I/O
>> of all LUNs which shared a same HBA.
>>
>> So here if we successfully abort a command, we can finish it via
>> scsi_finish_command() which would reduce time spent on scsi_error_handler()
>>
>> Signed-off-by: Wenchao Hao <haowenchao@...wei.com>
>> ---
>>  drivers/scsi/scsi_error.c | 55 +++++++++++++++++++++------------------
>>  1 file changed, 29 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
>> index cdaca13ac1f1..15299603b7ee 100644
>> --- a/drivers/scsi/scsi_error.c
>> +++ b/drivers/scsi/scsi_error.c
>> @@ -173,41 +173,44 @@ scmd_eh_abort_handler(struct work_struct *work)
>>  		goto out;
>>  	}
>>  	set_host_byte(scmd, DID_TIME_OUT);
>> -	if (scsi_host_eh_past_deadline(shost)) {
>> -		SCSI_LOG_ERROR_RECOVERY(3,
>> -			scmd_printk(KERN_INFO, scmd,
>> -				    "eh timeout, not retrying "
>> -				    "aborted command\n"));
>> -		goto out;
>> -	}
>>  
>> -	spin_lock_irqsave(shost->host_lock, flags);
>> -	list_del_init(&scmd->eh_entry);
>> +	if (scsi_noretry_cmd(scmd) ||
>> +	    !scsi_cmd_retry_allowed(scmd) &&
>> +	    !scsi_eh_should_retry_cmd(scmd)) {
> 
> 
> I don't think this test is correct. Did you want all ||s?

Sorry, I made a mistake, it should be "||" here.

> 
> For what the patch is trying to accomplish I'm not sure if it's
> the behavior people wanted. Do all drivers have configurable
> abort timeouts? If an abort takes N minutes to complete ok,
> and that's put us over the eh deadline maybe the user wanted
> that device to be offlined.
> .
> 

I think the changes would not affect the timeouts and deadline
because the logic changed is after command abort successfully.

We check if the command need to be retired after aborting command
successfully, if it does not need to retry, just finish it (only
commands aborted failed should be add to error handle list).

I submit this change because we want to reduce the time spent on 
scsi_error_handler().

Do you think the change is OK? Except previous wrong "&&" which should
be "||".

Thanks, looking forward to hearing from you.