lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <28dbc2.10c5.197529065a2.Coremail.zhengqixing@huaweicloud.com>
Date: Mon, 9 Jun 2025 10:41:24 +0800 (GMT+08:00)
From: zhengqixing@...weicloud.com
To: "Zheng Qixing" <zhengqixing@...weicloud.com>, 
	sathya.prakash@...adcom.com, sreekanth.reddy@...adcom.com, 
	suganath-prabu.subramani@...adcom.com, 
	James.Bottomley@...senPartnership.com, martin.petersen@...cle.com
Cc: MPT-FusionLinux.pdl@...adcom.com, linux-scsi@...r.kernel.org, 
	linux-kernel@...r.kernel.org, yukuai3@...wei.com, 
	yi.zhang@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v2] scsi: mpt3sas: fix uaf in
 _scsih_fw_event_cleanup_queue() during hard reset

Hi All,


Gentle ping on this patch submitted two weeks ago. Could someone please

take a look when convenient?


Thanks,

Qixing


在 2025/5/28 8:49, Zheng Qixing 写道:
&gt; From: Zheng Qixing <zhengqixing@...wei.com>
&gt;
&gt; Changes in v2:
&gt;
&gt;    1. Reference counting: Ensuring the fw_event_work has proper reference
&gt;      counting so it's not freed while still in use
&gt;    2. Read-write locks: Protecting access to ioc-&gt;current_event with locks
&gt;       to prevent race conditions between readers and writers
&gt;
&gt; During mpt3sas hard reset, there are two asynchronous execution paths that
&gt; can lead to use-after-free issues:
&gt;
&gt; Path A (cleanup):
&gt;    _base_clear_outstanding_commands()
&gt;      mpt3sas_scsih_clear_outstanding_scsi_tm_commands()
&gt;        _scsih_fw_event_cleanup_queue()
&gt; 	cancel_work_sync // UAF!
&gt;
&gt; Path B (recovery):
&gt;    _base_reset_done_handler()
&gt;      mpt3sas_scsih_reset_done_handler()
&gt;        _scsih_error_recovery_delete_devices()
&gt;          alloc_fw_event_work()
&gt;          _scsih_fw_event_add()
&gt;            _firmware_event_work()
&gt;              _mpt3sas_fw_work() // free fw_event
&gt;
&gt; Here is a use-after-free issue during hard reset:
&gt;
&gt; ==================================================================
&gt; BUG: KASAN: slab-use-after-free in __cancel_work_timer+0x172/0x3e0
&gt; Read of size 8 at addr ffff88be08e72610 by task scsi_eh_10/980
&gt;
&gt; CPU: 50 PID: 980 Comm: scsi_eh_10 Kdump: loaded Not tainted 6.6.0+ #24
&gt; Call Trace:
&gt;   <task>
&gt;   __cancel_work_timer+0x172/0x3e0
&gt;   _scsih_fw_event_cleanup_queue+0x2a2/0x570 [mpt3sas]
&gt;   mpt3sas_scsih_clear_outstanding_scsi_tm_commands+0x171/0x2c0 [mpt3sas]
&gt;   mpt3sas_base_hard_reset_handler+0x2e8/0x9d0 [mpt3sas]
&gt;   mpt3sas_scsih_issue_tm+0xaa1/0xc10 [mpt3sas]
&gt;   scsih_target_reset+0x344/0x7f0 [mpt3sas]
&gt;   scsi_try_target_reset+0xa7/0x1f0
&gt;   scsi_eh_target_reset+0x4e8/0xc50
&gt;   scsi_eh_ready_devs+0xc8/0x5b0
&gt;   scsi_unjam_host+0x2fa/0x700
&gt;   scsi_error_handler+0x434/0x700
&gt;   kthread+0x2d1/0x3b0
&gt;   ret_from_fork+0x2b/0x70
&gt;   ret_from_fork_asm+0x1b/0x30
&gt;   </task>
&gt;
&gt; Allocated by task 980:
&gt;   mpt3sas_scsih_reset_done_handler+0x575/0x7f0 [mpt3sas]
&gt;   mpt3sas_base_hard_reset_handler+0x7a7/0x9d0 [mpt3sas]
&gt;   mpt3sas_scsih_issue_tm+0xaa1/0xc10 [mpt3sas]
&gt;   scsih_dev_reset+0x354/0x8e0 [mpt3sas]
&gt;   scsi_eh_bus_device_reset+0x255/0x7a0
&gt;   scsi_eh_ready_devs+0xb6/0x5b0
&gt;   scsi_unjam_host+0x2fa/0x700
&gt;   scsi_error_handler+0x434/0x700
&gt;   kthread+0x2d1/0x3b0
&gt;   ret_from_fork+0x2b/0x70
&gt;   ret_from_fork_asm+0x1b/0x30
&gt;
&gt; Freed by task 660838:
&gt;   __kmem_cache_free+0x174/0x370
&gt;   _mpt3sas_fw_work+0x269/0x2510 [mpt3sas]
&gt;   process_one_work+0x578/0xc60
&gt;   worker_thread+0x6c0/0xc90
&gt;   kthread+0x2d1/0x3b0
&gt;   ret_from_fork+0x2b/0x70
&gt;   ret_from_fork_asm+0x1b/0x30
&gt;
&gt; Last potentially related work creation:
&gt;   insert_work+0x24/0x230
&gt;   __queue_work.part.0+0x3d2/0x840
&gt;   queue_work_on+0x4b/0x60
&gt;   _scsih_fw_event_add.part.0+0x20e/0x2c0 [mpt3sas]
&gt;   mpt3sas_scsih_reset_done_handler+0x64b/0x7f0 [mpt3sas]
&gt;   mpt3sas_base_hard_reset_handler+0x7a7/0x9d0 [mpt3sas]
&gt;   mpt3sas_scsih_issue_tm+0xaa1/0xc10 [mpt3sas]
&gt;   scsih_dev_reset+0x354/0x8e0 [mpt3sas]
&gt;   scsi_eh_bus_device_reset+0x255/0x7a0
&gt;   scsi_eh_ready_devs+0xb6/0x5b0
&gt;   scsi_unjam_host+0x2fa/0x700
&gt;   scsi_error_handler+0x434/0x700
&gt;   kthread+0x2d1/0x3b0
&gt;   ret_from_fork+0x2b/0x70
&gt;   ret_from_fork_asm+0x1b/0x30
&gt;
&gt; Second to last potentially related work creation:
&gt;   kasan_save_stack+0x21/0x40
&gt;   __kasan_record_aux_stack+0x94/0xa0
&gt;   kvfree_call_rcu+0x25/0xa20
&gt;   kernfs_unlink_open_file+0x2dd/0x410
&gt;   kernfs_fop_release+0xc4/0x320
&gt;   __fput+0x35e/0xa10
&gt;   __se_sys_close+0x4f/0xa0
&gt;   do_syscall_64+0x55/0x100
&gt;   entry_SYSCALL_64_after_hwframe+0x78/0xe2
&gt;
&gt; After commit 991df3dd5144 ("scsi: mpt3sas: Fix use-after-free warning"),
&gt; a race condition exists in _scsih_fw_event_cleanup_queue(). When Path A
&gt; dequeues a fw_event and Path B concurrently processes the same fw_event,
&gt; the reference count can drop to zero before cancel_work_sync() is called
&gt; in Path A, leading to use-after-free when accessing the already freed
&gt; fw_event structure.
&gt;
&gt; Fix this by:
&gt; 1. Protecting all accesses to ioc-&gt;current_event with ioc-&gt;fw_event_lock
&gt; 2. Adding reference counting when accessing current_event
&gt; 3. Moving the fw_event_work_put() call from dequeue_next_fw_event() to
&gt;     the caller, ensuring fw_event remains valid during cancel_work_sync()
&gt;
&gt; Fixes: 991df3dd5144 ("scsi: mpt3sas: Fix use-after-free warning")
&gt; Signed-off-by: Zheng Qixing <zhengqixing@...wei.com>
&gt; ---
&gt;   drivers/scsi/mpt3sas/mpt3sas_scsih.c | 40 +++++++++++++++++++++-------
&gt;   1 file changed, 31 insertions(+), 9 deletions(-)
&gt;
&gt; diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
&gt; index 508861e88d9f..a17963ce3f93 100644
&gt; --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
&gt; +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
&gt; @@ -3659,7 +3659,6 @@ static struct fw_event_work *dequeue_next_fw_event(struct MPT3SAS_ADAPTER *ioc)
&gt;   		fw_event = list_first_entry(&amp;ioc-&gt;fw_event_list,
&gt;   				struct fw_event_work, list);
&gt;   		list_del_init(&amp;fw_event-&gt;list);
&gt; -		fw_event_work_put(fw_event);
&gt;   	}
&gt;   	spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   
&gt; @@ -3679,10 +3678,15 @@ static void
&gt;   _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
&gt;   {
&gt;   	struct fw_event_work *fw_event;
&gt; +	unsigned long flags;
&gt;   
&gt; +	spin_lock_irqsave(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   	if ((list_empty(&amp;ioc-&gt;fw_event_list) &amp;&amp; !ioc-&gt;current_event) ||
&gt; -	    !ioc-&gt;firmware_event_thread)
&gt; +	    !ioc-&gt;firmware_event_thread) {
&gt; +		spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   		return;
&gt; +	}
&gt; +
&gt;   	/*
&gt;   	 * Set current running event as ignore, so that
&gt;   	 * current running event will exit quickly.
&gt; @@ -3691,10 +3695,21 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
&gt;   	 */
&gt;   	if (ioc-&gt;shost_recovery &amp;&amp; ioc-&gt;current_event)
&gt;   		ioc-&gt;current_event-&gt;ignore = 1;
&gt; +	spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   
&gt;   	ioc-&gt;fw_events_cleanup = 1;
&gt; -	while ((fw_event = dequeue_next_fw_event(ioc)) ||
&gt; -	     (fw_event = ioc-&gt;current_event)) {
&gt; +	while (true) {
&gt; +		fw_event = dequeue_next_fw_event(ioc);
&gt; +
&gt; +		spin_lock_irqsave(&amp;ioc-&gt;fw_event_lock, flags);
&gt; +		if (!fw_event) {
&gt; +			fw_event = ioc-&gt;current_event;
&gt; +			if (!fw_event) {
&gt; +				spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt; +				break;
&gt; +			}
&gt; +			fw_event_work_get(fw_event);
&gt; +		}
&gt;   
&gt;   		/*
&gt;   		 * Don't call cancel_work_sync() for current_event
&gt; @@ -3714,8 +3729,11 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
&gt;   		    ioc-&gt;current_event-&gt;event !=
&gt;   		    MPT3SAS_REMOVE_UNRESPONDING_DEVICES) {
&gt;   			ioc-&gt;current_event = NULL;
&gt; +			spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt; +			fw_event_work_put(fw_event);
&gt;   			continue;
&gt;   		}
&gt; +		spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   
&gt;   		/*
&gt;   		 * Driver has to clear ioc-&gt;start_scan flag when
&gt; @@ -3741,6 +3759,7 @@ _scsih_fw_event_cleanup_queue(struct MPT3SAS_ADAPTER *ioc)
&gt;   		if (cancel_work_sync(&amp;fw_event-&gt;work))
&gt;   			fw_event_work_put(fw_event);
&gt;   
&gt; +		fw_event_work_put(fw_event);
&gt;   	}
&gt;   	ioc-&gt;fw_events_cleanup = 0;
&gt;   }
&gt; @@ -10690,15 +10709,16 @@ mpt3sas_scsih_reset_done_handler(struct MPT3SAS_ADAPTER *ioc)
&gt;   static void
&gt;   _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
&gt;   {
&gt; +	unsigned long flags;
&gt; +
&gt; +	spin_lock_irqsave(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   	ioc-&gt;current_event = fw_event;
&gt; +	spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   	_scsih_fw_event_del_from_list(ioc, fw_event);
&gt;   
&gt;   	/* the queue is being flushed so ignore this event */
&gt; -	if (ioc-&gt;remove_host || ioc-&gt;pci_error_recovery) {
&gt; -		fw_event_work_put(fw_event);
&gt; -		ioc-&gt;current_event = NULL;
&gt; -		return;
&gt; -	}
&gt; +	if (ioc-&gt;remove_host || ioc-&gt;pci_error_recovery)
&gt; +		goto out;
&gt;   
&gt;   	switch (fw_event-&gt;event) {
&gt;   	case MPT3SAS_PROCESS_TRIGGER_DIAG:
&gt; @@ -10794,8 +10814,10 @@ _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
&gt;   		return;
&gt;   	}
&gt;   out:
&gt; +	spin_lock_irqsave(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   	fw_event_work_put(fw_event);
&gt;   	ioc-&gt;current_event = NULL;
&gt; +	spin_unlock_irqrestore(&amp;ioc-&gt;fw_event_lock, flags);
&gt;   }
&gt;   
&gt;   /**
</zhengqixing@...wei.com></zhengqixing@...wei.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ