linux-kernel - Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification


Message-ID: <548b602daa1e15415625cb8d1f81a208@codeaurora.org>
Date:   Sat, 01 Aug 2020 07:17:08 +0800
From:   Can Guo <cang@...eaurora.org>
To:     Bart Van Assche <bvanassche@....org>
Cc:     Stanley Chu <stanley.chu@...iatek.com>,
        Avri Altman <Avri.Altman@....com>, linux-scsi@...r.kernel.org,
        martin.petersen@...cle.com, alim.akhtar@...sung.com,
        jejb@...ux.ibm.com, beanhuo@...ron.com, asutoshd@...eaurora.org,
        matthias.bgg@...il.com, linux-mediatek@...ts.infradead.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        kuohong.wang@...iatek.com, peter.wang@...iatek.com,
        chun-hung.wu@...iatek.com, andy.teng@...iatek.com,
        chaotian.jing@...iatek.com, cc.chou@...iatek.com
Subject: Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt
 notification

Hi Bart,

On 2020-08-01 00:51, Bart Van Assche wrote:
> On 2020-07-31 01:00, Can Guo wrote:
>> AFAIK, synchronization of scsi_done is not a problem here, because the scsi
>> layer uses the atomic state of a scsi cmd, namely SCMD_STATE_COMPLETE, to
>> prevent its abort and its real completion from running concurrently.
>> 
>> Check func scsi_times_out(), hope it helps.
>> 
>> enum blk_eh_timer_return scsi_times_out(struct request *req)
>> {
>> ...
>>         if (rtn == BLK_EH_DONE) {
>>                 /*
>>                  * Set the command to complete first in order to prevent a real
>>                  * completion from releasing the command while error handling
>>                  * is using it. If the command was already completed, then the
>>                  * lower level driver beat the timeout handler, and it is safe
>>                  * to return without escalating error recovery.
>>                  *
>>                  * If timeout handling lost the race to a real completion, the
>>                  * block layer may ignore that due to a fake timeout injection,
>>                  * so return RESET_TIMER to allow error handling another shot
>>                  * at this command.
>>                  */
>>                 if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
>>                         return BLK_EH_RESET_TIMER;
>>                 if (scsi_abort_command(scmd) != SUCCESS) {
>>                         set_host_byte(scmd, DID_TIME_OUT);
>>                         scsi_eh_scmd_add(scmd);
>>                 }
>>         }
>> }
> 
> I am familiar with this mechanism. My concern is that both the regular
> completion path and the abort handler must call scsi_dma_unmap() before
> calling cmd->scsi_done(cmd). I don't see how
> test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state) could prevent the
> regular completion path and the abort handler from calling scsi_dma_unmap()
> concurrently, since both calls happen before the SCMD_STATE_COMPLETE bit
> is set.
> 
> Thanks,
> 
> Bart.

For the scsi_dma_unmap() part, that is true - we should serialize it with
any other completion paths. I found this during my fault injection testing
and have written a patch to fix it, but it is part of my next error recovery
enhancement patch series. Please check the attachment.
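
To illustrate the kind of serialization meant here, below is a minimal
userspace sketch in C11. It is not the attached patch and not the ufshcd
code. The names outstanding_reqs, claim_tag(), release_resources(),
complete_path() and abort_path() are hypothetical stand-ins. The assumption
is that each issued tag owns one bit in a shared bitmap, and whichever path
atomically clears that bit is the only one allowed to release the resources
(the stand-in for scsi_dma_unmap()) for that tag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NUM_TAGS 32

/* Hypothetical bitmap of in-flight requests, one bit per tag. */
static atomic_ulong outstanding_reqs;

/* Counts how often resources were released per tag (for illustration). */
static int unmap_calls[NUM_TAGS];

/* Mark a tag as in flight when the request is issued. */
static void issue(int tag)
{
    assert(tag >= 0 && tag < NUM_TAGS);
    atomic_fetch_or(&outstanding_reqs, 1UL << tag);
}

/*
 * Atomically clear the tag's bit. Only the caller that observed the bit
 * as set gets 'true', so at most one path can claim a given tag.
 */
static bool claim_tag(int tag)
{
    unsigned long mask = 1UL << tag;

    return (atomic_fetch_and(&outstanding_reqs, ~mask) & mask) != 0;
}

/* Stand-in for the DMA unmap and other per-request cleanup. */
static void release_resources(int tag)
{
    assert(tag >= 0 && tag < NUM_TAGS);
    unmap_calls[tag]++;
}

/* Regular completion path: clean up only if this path won the claim. */
static bool complete_path(int tag)
{
    if (!claim_tag(tag))
        return false;
    release_resources(tag);
    return true;
}

/* Abort path: same rule, so it never repeats cleanup already done. */
static bool abort_path(int tag)
{
    if (!claim_tag(tag))
        return false;
    release_resources(tag);
    return true;
}
```

With this arrangement, if complete_path(3) and abort_path(3) race after
issue(3), exactly one of them returns true and release_resources(3) runs
once. The claim happens before the cleanup, which is the ordering the
SCMD_STATE_COMPLETE check alone does not give.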

Thanks,

Can Guo.


View attachment "0005-scsi-ufs-Properly-release-resources-if-a-task-is-abo.patch" of type "text/x-diff" (1473 bytes)