lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250911142135.GA624@yadro.com>
Date: Thu, 11 Sep 2025 17:21:35 +0300
From: Dmitry Bogdanov <d.bogdanov@...ro.com>
To: Tony Battersby <tonyb@...ernetics.com>
CC: Nilesh Javali <njavali@...vell.com>,
	<GR-QLogic-Storage-Upstream@...vell.com>, "James E.J. Bottomley"
	<James.Bottomley@...senPartnership.com>, "Martin K. Petersen"
	<martin.petersen@...cle.com>, linux-scsi <linux-scsi@...r.kernel.org>,
	<target-devel@...r.kernel.org>, <scst-devel@...ts.sourceforge.net>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 08/15] scsi: qla2xxx: fix oops during cmd abort

On Mon, Sep 08, 2025 at 02:58:06PM -0400, Tony Battersby wrote:
> 
> (target mode)
> 
> There is a race between the following:
> 
> CPU 1:
> scst_hw_pending_work_fn() ->
> sqa_on_hw_pending_cmd_timeout() ->
> qlt_abort_cmd() ->
> qlt_unmap_sg()
> 
> CPU 2:
> qla_do_work() ->
> qla24xx_process_response_queue() ->
> qlt_do_ctio_completion() ->
> qlt_unmap_sg()
> 
> Two CPUs calling qlt_unmap_sg() on the same cmd at the same time
> results in an oops:
> 
> dma_unmap_sg_attrs()
>         BUG_ON(!valid_dma_direction(dir));
> 
> This race is more likely to happen because qlt_abort_cmd() may cause the
> hardware to send a CTIO.
> 
> The solution is to lock cmd->qpair->qp_lock_ptr when aborting a command.
> This makes it possible to check the cmd state and make decisions about
> what to do without racing with the CTIO handler and other code.
> 
> - Lock cmd->qpair->qp_lock_ptr when aborting a cmd.
> - Eliminate cmd->cmd_lock and change cmd->aborted back to a bitfield
>   since it is now protected by qp_lock_ptr just like all the other
>   flags.
> - Add another command state QLA_TGT_STATE_DONE to avoid any possible
>   races between qlt_abort_cmd() and tgt_ops->free_cmd().
> - Add the cmd->sent_term_exchg flag to indicate if
>   qlt_send_term_exchange() has already been called.
> - For SCST (scst_hw_pending_work_fn()), export qlt_send_term_exchange()
>   and qlt_unmap_sg() so that they can be called directly instead of
>   trying to make qlt_abort_cmd() work for both HW timeout and TMR abort.
> - Add TRC_CTIO_IGNORED for scst_hw_pending_work_fn().
> 
> Fixes: 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")

You are trying to fix that commit using its approach, but actually that
approach is the root cause itself. It is not ok to unmap dma while that
memory is owned by HW.

We use this patch 4 years already instead of 26f9ce53817a and never
faced with such crashes.


From: Dmitry Bogdanov <d.bogdanov@...ro.com>
Date: Wed, 20 Oct 2021 15:57:31 +0300
Subject: [PATCH] scsi: qla2xxx: clear cmds after chip reset

Commands sent to FW, after chip reset got stuck and never freed as FW is
not going to response to them anymore.

This patch partially reverts aefed3e5548f at __qla2x00_abort_all_cmds.

Fixes: aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling")
Signed-off-by: Dmitry Bogdanov <d.bogdanov@...ro.com>
---
 drivers/scsi/qla2xxx/qla_os.c     | 20 ++++++++++++++++++--
 drivers/scsi/qla2xxx/qla_target.c |  2 +-
 drivers/scsi/qla2xxx/qla_target.h |  1 +
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 407047e8b42b..04b0d3eb97e7 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1809,10 +1809,26 @@ __qla2x00_abort_all_cmds(struct qla_qpair *qp, int res)
 					continue;
 				}
 				cmd = (struct qla_tgt_cmd *)sp;
-				cmd->aborted = 1;
+
+				if (cmd->sg_mapped)
+					qlt_unmap_sg(vha, cmd);
+
+				if (cmd->se_cmd.t_state == TRANSPORT_WRITE_PENDING) {
+					cmd->aborted = 1;
+					cmd->write_data_transferred = 0;
+					cmd->state = QLA_TGT_STATE_DATA_IN;
+					ha->tgt.tgt_ops->handle_data(cmd);
+				} else {
+					ha->tgt.tgt_ops->free_cmd(cmd);
+				}
 				break;
 			case TYPE_TGT_TMCMD:
-				/* Skip task management functions. */
+				/*
+				 * Currently, only ABTS response gets on the
+				 * outstanding_cmds[]
+				 */
+				ha->tgt.tgt_ops->free_mcmd(
+					(struct qla_tgt_mgmt_cmd *) sp);
 				break;
 			default:
 				break;
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index 8dfd13af48ea..c3b1a1426253 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -2463,7 +2463,7 @@ static int qlt_pci_map_calc_cnt(struct qla_tgt_prm *prm)
 	return -1;
 }
 
-static void qlt_unmap_sg(struct scsi_qla_host *vha, struct qla_tgt_cmd *cmd)
+void qlt_unmap_sg(struct scsi_qla_host *vha, struct qla_tgt_cmd *cmd)
 {
 	struct qla_hw_data *ha;
 	struct qla_qpair *qpair;
diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h
index 10e5e6c8087d..76e80208d731 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -1050,6 +1050,7 @@ extern int qlt_abort_cmd(struct qla_tgt_cmd *);
 extern void qlt_xmit_tm_rsp(struct qla_tgt_mgmt_cmd *);
 extern void qlt_free_mcmd(struct qla_tgt_mgmt_cmd *);
 extern void qlt_free_cmd(struct qla_tgt_cmd *cmd);
+extern void qlt_unmap_sg(struct scsi_qla_host *vha, struct qla_tgt_cmd *cmd);
 extern void qlt_async_event(uint16_t, struct scsi_qla_host *, uint16_t *);
 extern void qlt_enable_vha(struct scsi_qla_host *);
 extern void qlt_vport_create(struct scsi_qla_host *, struct qla_hw_data *);
-- 
2.25.1







Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ