linux-kernel - Re: [DMARC Error]Re: [PATCH 10/15] scsi: qla2xxx: fix TMR failure handling

Message-ID: <7e746ee3-6c51-4006-a5cf-17ca527be71b@cybernetics.com>
Date: Wed, 17 Sep 2025 16:38:22 -0400
From: Tony Battersby <tonyb@...ernetics.com>
To: Dmitry Bogdanov <d.bogdanov@...ro.com>
Cc: Nilesh Javali <njavali@...vell.com>,
 GR-QLogic-Storage-Upstream@...vell.com,
 "James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
 "Martin K. Petersen" <martin.petersen@...cle.com>,
 linux-scsi <linux-scsi@...r.kernel.org>, target-devel@...r.kernel.org,
 scst-devel@...ts.sourceforge.net,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [DMARC Error]Re: [PATCH 10/15] scsi: qla2xxx: fix TMR failure
 handling

On 9/17/25 09:06, Dmitry Bogdanov wrote:
> On Tue, Sep 16, 2025 at 12:04:11PM -0400, Tony Battersby wrote:
>> On 9/12/25 10:36, Dmitry Bogdanov wrote:
>>> On Mon, Sep 08, 2025 at 03:02:49PM -0400, Tony Battersby wrote:
>>>> - Calling mempool_free() directly can lead to memory-use-after-free.
>>> No, it is an API contract between modules. If handle_tmr returned an error,
>>> then the caller of handle_tmr is responsible for the cleanup.
>>> Otherwise, the target module (tcm_qla2xxx) is responsible. The same rule
>>> applies to handle_cmd.
>>>> +               qlt_xmit_tm_rsp(mcmd);
>>> qlt_xmit_tm_rsp does not free mcmd for TMF ABORT. So you introduce a memleak.
>> I just tested it, and there is no memleak.  qlt_build_abts_resp_iocb()
>> sets req->outstanding_cmds[h] to mcmd, and then
>> qlt_handle_abts_completion() calls ->free_mcmd after getting a response
>> from the ISP.
> ha->tgt.tgt_ops->free_mcmd(mcmd) can be called only when mcmd was handled by
> the target core, i.e. handle_tmr() returned 0. LIO's tcm_qla2xxx_free_mcmd
> calls the target core function transport_generic_free_cmd to release the
> core's command object, which in turn eventually calls mempool_free in qla2xxx.
> But when handle_tmr returns an error, that core command object does not
> exist and LIO will not free the mcmd. That is how a memleak
> will happen.

OK, I see the problem now.  Thanks for pointing that out.
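
To restate the ownership rule as I understand it, here is a minimal
user-space sketch.  Every demo_* name is hypothetical and only stands in
for the driver code; in the real driver the accepted-path free happens
later, after the response completes, not immediately as modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct qla_tgt_mgmt_cmd; not the driver's type. */
struct demo_mcmd {
	bool core_obj_created;	/* did the upper layer create its command object? */
};

static int demo_lld_frees;	/* frees done directly by the low-level driver */
static int demo_core_frees;	/* frees done through the upper layer */

/* Models handle_tmr(): returns 0 only if the upper layer accepted the TMR. */
static int demo_handle_tmr(struct demo_mcmd *mcmd, bool accept)
{
	assert(mcmd != NULL);
	mcmd->core_obj_created = accept;
	return accept ? 0 : -1;
}

/* Models ->free_mcmd(): only valid when the upper layer owns the command. */
static void demo_core_free(struct demo_mcmd *mcmd)
{
	assert(mcmd->core_obj_created);
	demo_core_frees++;
	free(mcmd);
}

/* Models the caller of handle_tmr() and applies the ownership rule. */
static int demo_submit_tmr(bool accept)
{
	struct demo_mcmd *mcmd = calloc(1, sizeof(*mcmd));
	int rc;

	if (!mcmd)
		return -1;

	rc = demo_handle_tmr(mcmd, accept);
	if (rc != 0) {
		/* Rejected: the upper layer never saw it, so the caller frees. */
		demo_lld_frees++;
		free(mcmd);
		return rc;
	}

	/* Accepted: the upper layer owns the command and frees it. */
	demo_core_free(mcmd);
	return 0;
}
```

Calling demo_submit_tmr(false) takes the caller-frees branch, which is
the branch that would leak if ->free_mcmd() were relied upon there.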

>> The original code had a memory-use-after-free by calling
>> qlt_build_abts_resp_iocb() and then mempool_free(), and
>> then qlt_handle_abts_completion() used the freed mcmd.  I can reword the
>> commit message to make this clearer.
> BTW, the easiest way to fix just that use-after-free is to use
> qlt_24xx_send_abts_resp instead of qlt_build_abts_resp_iocb, as in other
> places where there is no mcmd and no need to wait for the
> completion from FW.
>
> We use this patch:
>
>
> From: Dmitry Bogdanov <d.bogdanov@...ro.com>
> Date: Thu, 27 Apr 2023 15:41:55 +0300
> Subject: [PATCH] scsi: qla2xxx: fix use-after-free on ABTS_RESP sending
>
> If an abort was rejected by LIO Core qla2xxx sends ABTS_RESP with rejected
> status. But it does not save mcmd pointer which is freed right after sending
> the response to FW. It leads to use-after-free at the completion from FW.
>
> Use a correct function to send ABTS_RESP when the abort is rejected by LIO Core.
>
> Signed-off-by: Dmitry Bogdanov <d.bogdanov@...ro.com>
> ---
>  drivers/scsi/qla2xxx/qla_target.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> index b9d816b8341e..43eafdc765a3 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -2162,7 +2162,9 @@ static void qlt_do_tmr_work(struct work_struct *work)
>  		switch (mcmd->tmr_func) {
>  		case QLA_TGT_ABTS:
>  			mcmd->fc_tm_rsp = FCP_TMF_REJECTED;
> -			qlt_build_abts_resp_iocb(mcmd);
> +			qlt_24xx_send_abts_resp(mcmd->qpair,
> +						&mcmd->orig_iocb.abts,
> +						FCP_TMF_REJECTED, false);
>  			break;
>  		case QLA_TGT_LUN_RESET:
>  		case QLA_TGT_CLEAR_TS:
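
For reference, the ordering problem under discussion can be modeled in
user space.  This is only a sketch: the demo_* names, the slot count and
the "freed" marker are assumptions made for illustration, and nothing is
really freed, so the stale access is detectable instead of undefined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SLOTS 4	/* arbitrary size for the illustration */

/* Hypothetical stand-in for the mgmt cmd; "freed" marks a released object. */
struct demo_mcmd {
	bool freed;
};

/* Models req->outstanding_cmds[]: commands waiting for a firmware reply. */
static struct demo_mcmd *demo_outstanding[DEMO_SLOTS];

/* Models qlt_build_abts_resp_iocb(): park the command under a handle. */
static int demo_send_resp(struct demo_mcmd *mcmd)
{
	for (int h = 0; h < DEMO_SLOTS; h++) {
		if (!demo_outstanding[h]) {
			demo_outstanding[h] = mcmd;
			return h;
		}
	}
	return -1;	/* no free slot */
}

/* Models releasing the command; only marks it so misuse can be observed. */
static void demo_free(struct demo_mcmd *mcmd)
{
	mcmd->freed = true;
}

/*
 * Models qlt_handle_abts_completion(): look the command up by handle and
 * release it.  Returns false if the command was already released, which
 * in the real driver would be a use-after-free followed by a double free.
 */
static bool demo_completion(int h)
{
	struct demo_mcmd *mcmd = demo_outstanding[h];

	demo_outstanding[h] = NULL;
	if (!mcmd || mcmd->freed)
		return false;
	demo_free(mcmd);
	return true;
}
```

Releasing the command between demo_send_resp() and demo_completion()
makes the completion return false; leaving the release to the completion
alone makes it return true.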


This approach could work.  But the handle_tmr() error path already
differs significantly from qlt_xmit_tm_rsp(); I assume that is because
qlt_xmit_tm_rsp() is well-maintained, whereas the handle_tmr() error
path is not.  So I think it makes more sense to make qlt_xmit_tm_rsp()
work for both cases.  Here is my proposed v2 patch:

Changes since v1:
- Change FCP_TMF_REJECTED to FCP_TMF_FAILED.
- Add QLA24XX_MGMT_LLD_OWNED and qlt_free_ul_mcmd().
- Improve patch description.

From: Tony Battersby <tonyb@...ernetics.com>
Date: Wed, 17 Sep 2025 15:59:23 -0400
Subject: [PATCH] scsi: qla2xxx: fix TMR failure handling

(target mode)

If handle_tmr() fails:

- The code for QLA_TGT_ABTS results in memory-use-after-free and
  double-free:
	qlt_do_tmr_work()
		qlt_build_abts_resp_iocb()
			qpair->req->outstanding_cmds[h] = (srb_t *)mcmd;
		mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool); FIRST FREE
	qlt_handle_abts_completion()
		mcmd = qlt_ctio_to_cmd()
			cmd = req->outstanding_cmds[h];
			return cmd;
		vha  = mcmd->vha; USE-AFTER-FREE
		ha->tgt.tgt_ops->free_mcmd(mcmd); SECOND FREE

- qlt_send_busy() makes no sense because it sends a SCSI command
  response instead of a TMR response.

Instead, just call qlt_xmit_tm_rsp() to send a TMR failed response,
since that code is well-tested and handles a number of corner cases.
But it would be incorrect to call ha->tgt.tgt_ops->free_mcmd() after
handle_tmr() has failed, so add a flag to mcmd indicating the proper way
to free the mcmd so that qlt_xmit_tm_rsp() can be used for both cases.

Signed-off-by: Tony Battersby <tonyb@...ernetics.com>
---
 drivers/scsi/qla2xxx/qla_target.c | 54 +++++++++++++------------------
 drivers/scsi/qla2xxx/qla_target.h |  2 ++
 2 files changed, 24 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 1e81582085e3..f73ecf680f52 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -2017,7 +2017,6 @@ static void qlt_do_tmr_work(struct work_struct *work)
 	struct qla_hw_data *ha = mcmd->vha->hw;
 	int rc;
 	uint32_t tag;
-	unsigned long flags;
 
 	switch (mcmd->tmr_func) {
 	case QLA_TGT_ABTS:
@@ -2032,34 +2031,12 @@ static void qlt_do_tmr_work(struct work_struct *work)
 	    mcmd->tmr_func, tag);
 
 	if (rc != 0) {
-		spin_lock_irqsave(mcmd->qpair->qp_lock_ptr, flags);
-		switch (mcmd->tmr_func) {
-		case QLA_TGT_ABTS:
-			mcmd->fc_tm_rsp = FCP_TMF_REJECTED;
-			qlt_build_abts_resp_iocb(mcmd);
-			break;
-		case QLA_TGT_LUN_RESET:
-		case QLA_TGT_CLEAR_TS:
-		case QLA_TGT_ABORT_TS:
-		case QLA_TGT_CLEAR_ACA:
-		case QLA_TGT_TARGET_RESET:
-			qlt_send_busy(mcmd->qpair, &mcmd->orig_iocb.atio,
-			    qla_sam_status);
-			break;
-
-		case QLA_TGT_ABORT_ALL:
-		case QLA_TGT_NEXUS_LOSS_SESS:
-		case QLA_TGT_NEXUS_LOSS:
-			qlt_send_notify_ack(mcmd->qpair,
-			    &mcmd->orig_iocb.imm_ntfy, 0, 0, 0, 0, 0, 0);
-			break;
-		}
-		spin_unlock_irqrestore(mcmd->qpair->qp_lock_ptr, flags);
-
 		ql_dbg(ql_dbg_tgt_mgt, mcmd->vha, 0xf052,
 		    "qla_target(%d):  tgt_ops->handle_tmr() failed: %d\n",
 		    mcmd->vha->vp_idx, rc);
-		mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool);
+		mcmd->flags |= QLA24XX_MGMT_LLD_OWNED;
+		mcmd->fc_tm_rsp = FCP_TMF_FAILED;
+		qlt_xmit_tm_rsp(mcmd);
 	}
 }
 
@@ -2246,6 +2223,19 @@ void qlt_free_mcmd(struct qla_tgt_mgmt_cmd *mcmd)
 }
 EXPORT_SYMBOL(qlt_free_mcmd);
 
+/*
+ * If the upper layer knows about this mgmt cmd, then call its ->free_mcmd()
+ * callback, which will eventually call qlt_free_mcmd().  Otherwise, call
+ * qlt_free_mcmd() directly.
+ */
+void qlt_free_ul_mcmd(struct qla_hw_data *ha, struct qla_tgt_mgmt_cmd *mcmd)
+{
+	if (mcmd->flags & QLA24XX_MGMT_LLD_OWNED)
+		qlt_free_mcmd(mcmd);
+	else
+		ha->tgt.tgt_ops->free_mcmd(mcmd);
+}
+
 /*
  * ha->hardware_lock supposed to be held on entry. Might drop it, then
  * reacquire
@@ -2338,12 +2328,12 @@ void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *mcmd)
 			"RESET-TMR online/active/old-count/new-count = %d/%d/%d/%d.\n",
 			vha->flags.online, qla2x00_reset_active(vha),
 			mcmd->reset_count, qpair->chip_reset);
-		ha->tgt.tgt_ops->free_mcmd(mcmd);
+		qlt_free_ul_mcmd(ha, mcmd);
 		spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
 		return;
 	}
 
-	if (mcmd->flags == QLA24XX_MGMT_SEND_NACK) {
+	if (mcmd->flags & QLA24XX_MGMT_SEND_NACK) {
 		switch (mcmd->orig_iocb.imm_ntfy.u.isp24.status_subcode) {
 		case ELS_LOGO:
 		case ELS_PRLO:
@@ -2376,7 +2366,7 @@ void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *mcmd)
 	 * qlt_xmit_tm_rsp() returns here..
 	 */
 	if (free_mcmd)
-		ha->tgt.tgt_ops->free_mcmd(mcmd);
+		qlt_free_ul_mcmd(ha, mcmd);
 
 	spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
 }
@@ -5717,7 +5707,7 @@ static void qlt_handle_abts_completion(struct scsi_qla_host *vha,
 		if (le32_to_cpu(entry->error_subcode1) == 0x1E &&
 		    le32_to_cpu(entry->error_subcode2) == 0) {
 			if (qlt_chk_unresolv_exchg(vha, rsp->qpair, entry)) {
-				ha->tgt.tgt_ops->free_mcmd(mcmd);
+				qlt_free_ul_mcmd(ha, mcmd);
 				return;
 			}
 			qlt_24xx_retry_term_exchange(vha, rsp->qpair,
@@ -5728,10 +5718,10 @@ static void qlt_handle_abts_completion(struct scsi_qla_host *vha,
 			    vha->vp_idx, entry->compl_status,
 			    entry->error_subcode1,
 			    entry->error_subcode2);
-			ha->tgt.tgt_ops->free_mcmd(mcmd);
+			qlt_free_ul_mcmd(ha, mcmd);
 		}
 	} else if (mcmd) {
-		ha->tgt.tgt_ops->free_mcmd(mcmd);
+		qlt_free_ul_mcmd(ha, mcmd);
 	}
 }
 
diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h
index 15a59c125c53..73432be21461 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -965,6 +965,7 @@ struct qla_tgt_mgmt_cmd {
 	unsigned int flags;
 #define QLA24XX_MGMT_SEND_NACK	BIT_0
 #define QLA24XX_MGMT_ABORT_IO_ATTR_VALID BIT_1
+#define QLA24XX_MGMT_LLD_OWNED	BIT_2
 	uint32_t reset_count;
 	struct work_struct work;
 	uint64_t unpacked_lun;
@@ -1056,6 +1057,7 @@ extern int qlt_rdy_to_xfer(struct qla_tgt_cmd *);
 extern int qlt_xmit_response(struct qla_tgt_cmd *, int, uint8_t);
 extern int qlt_abort_cmd(struct qla_tgt_cmd *);
 extern void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *);
+void qlt_free_ul_mcmd(struct qla_hw_data *ha, struct qla_tgt_mgmt_cmd *mcmd);
 extern void qlt_free_mcmd(struct qla_tgt_mgmt_cmd *);
 extern void qlt_free_cmd(struct qla_tgt_cmd *cmd);
 extern void qlt_async_event(uint16_t, struct scsi_qla_host *, uint16_t *);
-- 
2.43.0
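
Two details of the patch can be shown in isolation: why the SEND_NACK
test changes from "==" to "&" once a further flag bit exists, and how the
new flag selects the free path.  This is a user-space sketch, not driver
code; the DEMO_* and demo_* names are hypothetical, and only the bit
positions are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions mirror the patch (BIT_0 and BIT_2); names are made up. */
#define DEMO_SEND_NACK	(1u << 0)
#define DEMO_LLD_OWNED	(1u << 2)

enum demo_free_path {
	DEMO_FREE_DIRECT,		/* like calling qlt_free_mcmd() */
	DEMO_FREE_VIA_UPPER_LAYER	/* like calling ->free_mcmd() */
};

/* Old-style test: true only when SEND_NACK is the sole bit set. */
static bool demo_nack_eq(unsigned int flags)
{
	return flags == DEMO_SEND_NACK;
}

/* New-style test: true whenever SEND_NACK is set, whatever else is set. */
static bool demo_nack_mask(unsigned int flags)
{
	return (flags & DEMO_SEND_NACK) != 0;
}

/* Mirrors the decision made in qlt_free_ul_mcmd(). */
static enum demo_free_path demo_pick_free_path(unsigned int flags)
{
	return (flags & DEMO_LLD_OWNED) ? DEMO_FREE_DIRECT
					: DEMO_FREE_VIA_UPPER_LAYER;
}
```

With flags = DEMO_SEND_NACK | DEMO_LLD_OWNED, demo_nack_eq() is false
while demo_nack_mask() is true, so the mask form stays correct if
another bit is ever set alongside SEND_NACK.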