Message-ID: <39398407-009e-4afe-acb6-e3de931627d7@nvidia.com>
Date: Tue, 17 Jun 2025 14:58:15 -0700
From: Fenghua Yu <fenghuay@...dia.com>
To: Yi Sun <yi.sun@...el.com>, vinicius.gomes@...el.com,
dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: dave.jiang@...el.com, gordon.jin@...el.com
Subject: Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
Hi, Yi,
On 6/17/25 03:27, Yi Sun wrote:
> A recent refactor introduced a misplaced put_device() call, leading to a
> reference count underflow during module unload.
>
> There is no need to add additional put_device() calls for idxd groups,
> engines, or workqueues. Although commit a409e919ca3 claims: "Note, this
> also fixes the missing put_device() for idxd groups, engines, and wqs.",
> it appears no such omission existed. The required cleanup is already
> handled by the call chain:
>
>
> Extend idxd_cleanup() to perform the necessary cleanup, and remove
> idxd_cleanup_internals() which was not originally part of the driver
> unload path and introduced unintended reference count underflow.
>
> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
> Signed-off-by: Yi Sun <yi.sun@...el.com>
>
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index 40cc9c070081..40f4bf446763 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
> device_unregister(idxd_confdev(idxd));
> idxd_shutdown(pdev);
> idxd_device_remove_debugfs(idxd);
> - idxd_cleanup(idxd);
> + perfmon_pmu_remove(idxd);
> + idxd_cleanup_interrupts(idxd);
> + if (device_pasid_enabled(idxd))
> + idxd_disable_system_pasid(idxd);
>
This change will leak memory.
idxd_remove_internals() not only calls put_device() but also frees the
memory allocated for wqs, engines, and groups. Without the call to
idxd_remove_internals(), that memory is leaked.
I think the right fix is to remove the put_device() calls from
idxd_cleanup_wqs/engines/groups(), because:
1. idxd_setup_wqs/engines/groups() do not call get_device(), so their
counterparts idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
2. It fixes the refcount underflow described in this patch without
introducing a memory leak.
> pci_iounmap(pdev, idxd->reg_base);
> put_device(idxd_confdev(idxd));
> pci_disable_device(pdev);
Thanks.
-Fenghua