[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20250917054602.16457-1-duoming@zju.edu.cn>
Date: Wed, 17 Sep 2025 13:46:02 +0800
From: Duoming Zhou <duoming@....edu.cn>
To: netdev@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
pabeni@...hat.com,
kuba@...nel.org,
edumazet@...gle.com,
davem@...emloft.net,
andrew+netdev@...n.ch,
Duoming Zhou <duoming@....edu.cn>
Subject: [PATCH v3 net] cnic: Fix use-after-free bugs in cnic_delete_task
The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(),
which does not guarantee that the delayed work item 'delete_task' has
fully completed if it was already running. Additionally, the delayed work
item is cyclic, the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only
blocks and waits for work items that were already queued to the
workqueue prior to its invocation. Any work items submitted after
flush_workqueue() is called are not included in the set of tasks that the
flush operation awaits. This means that after the cyclic work items have
finished executing, a delayed work item may still exist in the workqueue.
This leads to use-after-free scenarios where the cnic_dev is deallocated
by cnic_free_dev(), while delete_task remains active and attempt to
dereference cnic_dev in cnic_delete_task().
A typical race condition is illustrated below:
CPU 0 (cleanup) | CPU 1 (delayed work callback)
cnic_netdev_event() |
cnic_stop_hw() | cnic_delete_task()
cnic_cm_stop_bnx2x_hw() | ...
cancel_delayed_work() | /* the queue_delayed_work()
flush_workqueue() | executes after flush_workqueue()*/
| queue_delayed_work()
cnic_free_dev(dev)//free | cnic_delete_task() //new instance
| dev = cp->dev; //use
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the cyclic delayed work item is properly canceled and that any
ongoing execution of the work item completes before the cnic_dev is
deallocated. Furthermore, since cancel_delayed_work_sync() uses
__flush_work(work, true) to synchronously wait for any currently
executing instance of the work item to finish, the flush_workqueue()
becomes redundant and should be removed.
This bug was identified through static analysis. To reproduce the issue
and validate the fix, I simulated the cnic PCI device in QEMU and
introduced intentional delays — such as inserting calls to ssleep()
within the cnic_delete_task() function — to increase the likelihood
of triggering the bug.
Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup")
Signed-off-by: Duoming Zhou <duoming@....edu.cn>
---
Changes in v3:
- Add how I discovered and tested the patch in the commit message.
- Remove flush_workqueue() in cnic_cm_stop_bnx2x_hw().
drivers/net/ethernet/broadcom/cnic.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index a9040c42d2ff..6e97a5a7daaf 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -4230,8 +4230,7 @@ static void cnic_cm_stop_bnx2x_hw(struct cnic_dev *dev)
cnic_bnx2x_delete_wait(dev, 0);
- cancel_delayed_work(&cp->delete_task);
- flush_workqueue(cnic_wq);
+ cancel_delayed_work_sync(&cp->delete_task);
if (atomic_read(&cp->iscsi_conn) != 0)
netdev_warn(dev->netdev, "%d iSCSI connections not destroyed\n",
--
2.34.1
Powered by blists - more mailing lists