Message-ID: <7e746ee3-6c51-4006-a5cf-17ca527be71b@cybernetics.com>
Date: Wed, 17 Sep 2025 16:38:22 -0400
From: Tony Battersby <tonyb@...ernetics.com>
To: Dmitry Bogdanov <d.bogdanov@...ro.com>
Cc: Nilesh Javali <njavali@...vell.com>,
GR-QLogic-Storage-Upstream@...vell.com,
"James E.J. Bottomley" <James.Bottomley@...senPartnership.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
linux-scsi <linux-scsi@...r.kernel.org>, target-devel@...r.kernel.org,
scst-devel@...ts.sourceforge.net,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [DMARC Error]Re: [PATCH 10/15] scsi: qla2xxx: fix TMR failure handling
On 9/17/25 09:06, Dmitry Bogdanov wrote:
> On Tue, Sep 16, 2025 at 12:04:11PM -0400, Tony Battersby wrote:
>> On 9/12/25 10:36, Dmitry Bogdanov wrote:
>>> On Mon, Sep 08, 2025 at 03:02:49PM -0400, Tony Battersby wrote:
>>>> - Calling mempool_free() directly can lead to memory-use-after-free.
>>> No, it is an API contract between modules. If handle_tmr returned an error,
>>> then the caller of handle_tmr is responsible for the cleanup.
>>> Otherwise, the target module (tcm_qla2xxx) is responsible. The same rule
>>> applies to handle_cmd.
>>>> + qlt_xmit_tm_rsp(mcmd);
>>> qlt_xmit_tm_rsp does not free mcmd for TMF ABORT. So you introduce a memleak.
>> I just tested it, and there is no memleak. qlt_build_abts_resp_iocb()
>> sets req->outstanding_cmds[h] to mcmd, and then
>> qlt_handle_abts_completion() calls ->free_mcmd after getting a response
>> from the ISP.
> ha->tgt.tgt_ops->free_mcmd(mcmd) can be called only when mcmd was handled by
> the target core, i.e. when handle_tmr() returned 0. LIO's
> tcm_qla2xxx_free_mcmd calls the target core function
> transport_generic_free_cmd to clear the core's command object, which in turn
> eventually calls mempool_free in qla2xxx.
> But when handle_tmr returns an error, that core command object does not
> exist and LIO will not free the mcmd. That is how a memleak will happen.
OK, I see the problem now. Thanks for pointing that out.
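To make the ownership rule concrete, here is a minimal user-space sketch of the contract as I understand it from the explanation above. It is only a model: struct model_mcmd and every model_* function are hypothetical stand-ins, not the kernel types or callbacks, and the "free" just sets a flag so the outcome can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct qla_tgt_mgmt_cmd; not the kernel type. */
struct model_mcmd {
	bool core_knows;	/* set once the upper layer accepted the TMR */
	bool freed;		/* set instead of really freeing the object */
};

/* Models ->handle_tmr(): 0 means the target core now tracks the command. */
static int model_handle_tmr(struct model_mcmd *m, bool core_accepts)
{
	if (!core_accepts)
		return -1;
	m->core_knows = true;
	return 0;
}

/*
 * Models ->free_mcmd(): assumption for this sketch is that the upper layer
 * can release only a command it tracks. For an untracked command nothing is
 * released, which is the leak discussed above.
 */
static void model_ul_free(struct model_mcmd *m)
{
	if (m->core_knows)
		m->freed = true;
}

/* Models the driver releasing the command itself. */
static void model_lld_free(struct model_mcmd *m)
{
	m->freed = true;
}

/*
 * Runs one TMR through the model and reports whether the command was freed.
 * lld_frees_on_error selects whether the driver cleans up after a failed
 * handle_tmr() or still defers to the upper layer.
 */
static bool model_submit(bool core_accepts, bool lld_frees_on_error)
{
	struct model_mcmd m = { false, false };
	int rc = model_handle_tmr(&m, core_accepts);

	if (rc != 0 && lld_frees_on_error)
		model_lld_free(&m);
	else
		model_ul_free(&m);
	return m.freed;
}

/* Self-check covering the three cases discussed in this thread. */
static void model_self_check(void)
{
	assert(model_submit(true, false));	/* success: upper layer frees */
	assert(!model_submit(false, false));	/* failure + ul free: leak */
	assert(model_submit(false, true));	/* failure + lld free: ok */
}
```

The middle case corresponds to calling ->free_mcmd() after handle_tmr() failed, so the caller has to pick the free path based on who owns the command.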
>> The original code had a memory-use-after-free by calling
>> qlt_build_abts_resp_iocb() and then mempool_free(), and
>> then qlt_handle_abts_completion() used the freed mcmd. I can reword the
>> commit message to make this clearer.
> BTW, the easiest way to fix just that use-after-free is to use
> qlt_24xx_send_abts_resp instead of qlt_build_abts_resp_iocb, as in other
> places where there is no mcmd and no need to wait for the
> completion from FW.
>
> We use this patch:
>
>
> From: Dmitry Bogdanov <d.bogdanov@...ro.com>
> Date: Thu, 27 Apr 2023 15:41:55 +0300
> Subject: [PATCH] scsi: qla2xxx: fix use-after-free on ABTS_RESP sending
>
> If an abort was rejected by LIO Core qla2xxx sends ABTS_RESP with rejected
> status. But it does not save mcmd pointer which is freed right after sending
> the response to FW. It leads to use-after-free at the completion from FW.
>
> Use a correct function to send ABTS_RESP when the abort is rejected by LIO Core.
>
> Signed-off-by: Dmitry Bogdanov <d.bogdanov@...ro.com>
> ---
> drivers/scsi/qla2xxx/qla_target.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
> index b9d816b8341e..43eafdc765a3 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -2162,7 +2162,9 @@ static void qlt_do_tmr_work(struct work_struct *work)
> switch (mcmd->tmr_func) {
> case QLA_TGT_ABTS:
> mcmd->fc_tm_rsp = FCP_TMF_REJECTED;
> - qlt_build_abts_resp_iocb(mcmd);
> + qlt_24xx_send_abts_resp(mcmd->qpair,
> + &mcmd->orig_iocb.abts,
> + FCP_TMF_REJECTED, false);
> break;
> case QLA_TGT_LUN_RESET:
> case QLA_TGT_CLEAR_TS:
This approach could work. But the handle_tmr() error path is already
significantly different than qlt_xmit_tm_rsp(); I assume because
qlt_xmit_tm_rsp() is well-maintained, whereas the handle_tmr() error
path is not. So I think it makes more sense to make qlt_xmit_tm_rsp()
work for both cases. Here is my proposed v2 patch:
Changes since v1:
- Change FCP_TMF_REJECTED to FCP_TMF_FAILED.
- Add QLA24XX_MGMT_LLD_OWNED and qlt_free_ul_mcmd().
- Improve patch description.
From: Tony Battersby <tonyb@...ernetics.com>
Date: Wed, 17 Sep 2025 15:59:23 -0400
Subject: [PATCH] scsi: qla2xxx: fix TMR failure handling
(target mode)
If handle_tmr() fails:
- The code for QLA_TGT_ABTS results in memory-use-after-free and
  double-free:
	qlt_do_tmr_work()
		qlt_build_abts_resp_iocb()
			qpair->req->outstanding_cmds[h] = (srb_t *)mcmd;
		mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool); FIRST FREE
	qlt_handle_abts_completion()
		mcmd = qlt_ctio_to_cmd()
			cmd = req->outstanding_cmds[h];
			return cmd;
		vha = mcmd->vha; USE-AFTER-FREE
		ha->tgt.tgt_ops->free_mcmd(mcmd); SECOND FREE
- qlt_send_busy() makes no sense because it sends a SCSI command
  response instead of a TMR response.
Instead just call qlt_xmit_tm_rsp() to send a TMR failed response,
since that code is well-tested and handles a number of corner cases.
But it would be incorrect to call ha->tgt.tgt_ops->free_mcmd() after
handle_tmr() failed, so add a flag to mcmd indicating the proper way to
free the mcmd so that qlt_xmit_tm_rsp() can be used for both cases.
Signed-off-by: Tony Battersby <tonyb@...ernetics.com>
---
drivers/scsi/qla2xxx/qla_target.c | 54 +++++++++++++------------------
drivers/scsi/qla2xxx/qla_target.h | 2 ++
2 files changed, 24 insertions(+), 32 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 1e81582085e3..f73ecf680f52 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -2017,7 +2017,6 @@ static void qlt_do_tmr_work(struct work_struct *work)
struct qla_hw_data *ha = mcmd->vha->hw;
int rc;
uint32_t tag;
- unsigned long flags;
switch (mcmd->tmr_func) {
case QLA_TGT_ABTS:
@@ -2032,34 +2031,12 @@ static void qlt_do_tmr_work(struct work_struct *work)
mcmd->tmr_func, tag);
if (rc != 0) {
- spin_lock_irqsave(mcmd->qpair->qp_lock_ptr, flags);
- switch (mcmd->tmr_func) {
- case QLA_TGT_ABTS:
- mcmd->fc_tm_rsp = FCP_TMF_REJECTED;
- qlt_build_abts_resp_iocb(mcmd);
- break;
- case QLA_TGT_LUN_RESET:
- case QLA_TGT_CLEAR_TS:
- case QLA_TGT_ABORT_TS:
- case QLA_TGT_CLEAR_ACA:
- case QLA_TGT_TARGET_RESET:
- qlt_send_busy(mcmd->qpair, &mcmd->orig_iocb.atio,
- qla_sam_status);
- break;
-
- case QLA_TGT_ABORT_ALL:
- case QLA_TGT_NEXUS_LOSS_SESS:
- case QLA_TGT_NEXUS_LOSS:
- qlt_send_notify_ack(mcmd->qpair,
- &mcmd->orig_iocb.imm_ntfy, 0, 0, 0, 0, 0, 0);
- break;
- }
- spin_unlock_irqrestore(mcmd->qpair->qp_lock_ptr, flags);
-
ql_dbg(ql_dbg_tgt_mgt, mcmd->vha, 0xf052,
"qla_target(%d): tgt_ops->handle_tmr() failed: %d\n",
mcmd->vha->vp_idx, rc);
- mempool_free(mcmd, qla_tgt_mgmt_cmd_mempool);
+ mcmd->flags |= QLA24XX_MGMT_LLD_OWNED;
+ mcmd->fc_tm_rsp = FCP_TMF_FAILED;
+ qlt_xmit_tm_rsp(mcmd);
}
}
@@ -2246,6 +2223,19 @@ void qlt_free_mcmd(struct qla_tgt_mgmt_cmd *mcmd)
}
EXPORT_SYMBOL(qlt_free_mcmd);
+/*
+ * If the upper layer knows about this mgmt cmd, then call its ->free_mcmd()
+ * callback, which will eventually call qlt_free_mcmd(). Otherwise, call
+ * qlt_free_mcmd() directly.
+ */
+void qlt_free_ul_mcmd(struct qla_hw_data *ha, struct qla_tgt_mgmt_cmd *mcmd)
+{
+ if (mcmd->flags & QLA24XX_MGMT_LLD_OWNED)
+ qlt_free_mcmd(mcmd);
+ else
+ ha->tgt.tgt_ops->free_mcmd(mcmd);
+}
+
/*
* ha->hardware_lock supposed to be held on entry. Might drop it, then
* reacquire
@@ -2338,12 +2328,12 @@ void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *mcmd)
"RESET-TMR online/active/old-count/new-count = %d/%d/%d/%d.\n",
vha->flags.online, qla2x00_reset_active(vha),
mcmd->reset_count, qpair->chip_reset);
- ha->tgt.tgt_ops->free_mcmd(mcmd);
+ qlt_free_ul_mcmd(ha, mcmd);
spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
return;
}
- if (mcmd->flags == QLA24XX_MGMT_SEND_NACK) {
+ if (mcmd->flags & QLA24XX_MGMT_SEND_NACK) {
switch (mcmd->orig_iocb.imm_ntfy.u.isp24.status_subcode) {
case ELS_LOGO:
case ELS_PRLO:
@@ -2376,7 +2366,7 @@ void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *mcmd)
* qlt_xmit_tm_rsp() returns here..
*/
if (free_mcmd)
- ha->tgt.tgt_ops->free_mcmd(mcmd);
+ qlt_free_ul_mcmd(ha, mcmd);
spin_unlock_irqrestore(qpair->qp_lock_ptr, flags);
}
@@ -5717,7 +5707,7 @@ static void qlt_handle_abts_completion(struct scsi_qla_host *vha,
if (le32_to_cpu(entry->error_subcode1) == 0x1E &&
le32_to_cpu(entry->error_subcode2) == 0) {
if (qlt_chk_unresolv_exchg(vha, rsp->qpair, entry)) {
- ha->tgt.tgt_ops->free_mcmd(mcmd);
+ qlt_free_ul_mcmd(ha, mcmd);
return;
}
qlt_24xx_retry_term_exchange(vha, rsp->qpair,
@@ -5728,10 +5718,10 @@ static void qlt_handle_abts_completion(struct scsi_qla_host *vha,
vha->vp_idx, entry->compl_status,
entry->error_subcode1,
entry->error_subcode2);
- ha->tgt.tgt_ops->free_mcmd(mcmd);
+ qlt_free_ul_mcmd(ha, mcmd);
}
} else if (mcmd) {
- ha->tgt.tgt_ops->free_mcmd(mcmd);
+ qlt_free_ul_mcmd(ha, mcmd);
}
}
diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h
index 15a59c125c53..73432be21461 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -965,6 +965,7 @@ struct qla_tgt_mgmt_cmd {
unsigned int flags;
#define QLA24XX_MGMT_SEND_NACK BIT_0
#define QLA24XX_MGMT_ABORT_IO_ATTR_VALID BIT_1
+#define QLA24XX_MGMT_LLD_OWNED BIT_2
uint32_t reset_count;
struct work_struct work;
uint64_t unpacked_lun;
@@ -1056,6 +1057,7 @@ extern int qlt_rdy_to_xfer(struct qla_tgt_cmd *);
extern int qlt_xmit_response(struct qla_tgt_cmd *, int, uint8_t);
extern int qlt_abort_cmd(struct qla_tgt_cmd *);
extern void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *);
+void qlt_free_ul_mcmd(struct qla_hw_data *ha, struct qla_tgt_mgmt_cmd *mcmd);
extern void qlt_free_mcmd(struct qla_tgt_mgmt_cmd *);
extern void qlt_free_cmd(struct qla_tgt_cmd *cmd);
extern void qlt_async_event(uint16_t, struct scsi_qla_host *, uint16_t *);
--
2.43.0
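
A note on the "flags ==" to "flags &" change in qlt_xmit_tm_rsp(): once a second bit such as QLA24XX_MGMT_LLD_OWNED can be set in the same word, an equality test against a single bit no longer matches. The stand-alone sketch below illustrates this. The MGMT_* macros and the two helper functions are hypothetical names that only mirror the bit positions in the patch; they are not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values mirroring BIT_0, BIT_1 and BIT_2 in the patch. */
#define MGMT_SEND_NACK			(1u << 0)
#define MGMT_ABORT_IO_ATTR_VALID	(1u << 1)
#define MGMT_LLD_OWNED			(1u << 2)

/* Old style: true only if SEND_NACK is the one and only bit set. */
static bool send_nack_by_equality(unsigned int flags)
{
	return flags == MGMT_SEND_NACK;
}

/* New style: true whenever the SEND_NACK bit is set, whatever else is. */
static bool send_nack_by_mask(unsigned int flags)
{
	return (flags & MGMT_SEND_NACK) != 0;
}

/* Self-check showing where the two tests agree and where they diverge. */
static void flag_self_check(void)
{
	unsigned int both = MGMT_SEND_NACK | MGMT_LLD_OWNED;

	assert(send_nack_by_equality(MGMT_SEND_NACK));
	assert(send_nack_by_mask(MGMT_SEND_NACK));
	assert(!send_nack_by_equality(both));	/* equality misses it */
	assert(send_nack_by_mask(both));	/* mask still sees it */
	assert(!send_nack_by_mask(MGMT_LLD_OWNED));
}
```

The mask form stays correct no matter which other flag bits are added to the word later.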