linux-kernel - [PATCH v2] scsi: ufs: Cleanup completed request without interrupt notification

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [day] [month] [year] [list]

Message-ID: <20200702041850.28028-1-stanley.chu@mediatek.com>
Date:   Thu, 2 Jul 2020 12:18:50 +0800
From:   Stanley Chu <stanley.chu@...iatek.com>
To:     <linux-scsi@...r.kernel.org>, <martin.petersen@...cle.com>,
        <avri.altman@....com>, <alim.akhtar@...sung.com>,
        <jejb@...ux.ibm.com>, <bvanassche@....org>
CC:     <beanhuo@...ron.com>, <asutoshd@...eaurora.org>,
        <cang@...eaurora.org>, <matthias.bgg@...il.com>,
        <linux-mediatek@...ts.infradead.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-kernel@...r.kernel.org>, <kuohong.wang@...iatek.com>,
        <peter.wang@...iatek.com>, <chun-hung.wu@...iatek.com>,
        <andy.teng@...iatek.com>, <chaotian.jing@...iatek.com>,
        <cc.chou@...iatek.com>, Stanley Chu <stanley.chu@...iatek.com>
Subject: [PATCH v2] scsi: ufs: Cleanup completed request without interrupt notification

If somehow no interrupt notification is raised for a completed request
and its doorbell bit is cleared by host, UFS driver needs to cleanup
its outstanding bit in ufshcd_abort().

Otherwise, system may crash by below abnormal flow:

After this request is requeued by SCSI layer with its
outstanding bit set, the next completed request will trigger
ufshcd_transfer_req_compl() to handle all "completed outstanding
bits". In this time, the "abnormal outstanding bit" will be detected
and the "requeued request" will be chosen to execute request
post-processing flow. This is wrong and blk_finish_request() will
BUG_ON because this request is still "alive".

It is worth mentioning that before ufshcd_abort() cleans the timed-out
request, driver needs to check again if this request is really not
handled by __ufshcd_transfer_req_compl() yet because it is possible
that its interrupt comes very lately before the cleaning.

Signed-off-by: Stanley Chu <stanley.chu@...iatek.com>
---
 drivers/scsi/ufs/ufshcd.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index cadfa9006972..0f4f3255e403 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6462,7 +6462,7 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 			/* command completed already */
 			dev_err(hba->dev, "%s: cmd at tag %d successfully cleared from DB.\n",
 				__func__, tag);
-			goto out;
+			goto cleanup;
 		} else {
 			dev_err(hba->dev,
 				"%s: no response from device. tag = %d, err %d\n",
@@ -6496,9 +6496,14 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 		goto out;
 	}

+cleanup:
+	spin_lock_irqsave(host->host_lock, flags);
+	if (!test_bit(tag, hba->outstanding_reqs)) {
+		spin_unlock_irqrestore(host->host_lock, flags);
+		goto out;
+	}
 	scsi_dma_unmap(cmd);

-	spin_lock_irqsave(host->host_lock, flags);
 	ufshcd_outstanding_req_clear(hba, tag);
 	hba->lrb[tag].cmd = NULL;
 	spin_unlock_irqrestore(host->host_lock, flags);
-- 
2.18.0