[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aIXuYVtGSV0OHHps@ysun46-mobl.ccr.corp.intel.com>
Date: Sun, 27 Jul 2025 17:16:17 +0800
From: Yi Sun <yi.sun@...el.com>
To: Fenghua Yu <fenghuay@...dia.com>
CC: <vinicius.gomes@...el.com>, <dmaengine@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <dave.jiang@...el.com>,
<gordon.jin@...el.com>
Subject: Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module
unload
On 17.06.2025 14:58, Fenghua Yu wrote:
>Hi, Yi,
>
>On 6/17/25 03:27, Yi Sun wrote:
>>A recent refactor introduced a misplaced put_device() call, leading to a
>>reference count underflow during module unload.
>>
>>There is no need to add additional put_device() calls for idxd groups,
>>engines, or workqueues. Although commit a409e919ca3 claims:"Note, this
>>also fixes the missing put_device() for idxd groups, engines, and wqs."
>>It appears no such omission existed. The required cleanup is already
>>handled by the call chain:
>>
>>
>>Extend idxd_cleanup() to perform the necessary cleanup, and remove
>>idxd_cleanup_internals() which was not originally part of the driver
>>unload path and introduced unintended reference count underflow.
>>
>>Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>>Signed-off-by: Yi Sun <yi.sun@...el.com>
>>
>>diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>>index 40cc9c070081..40f4bf446763 100644
>>--- a/drivers/dma/idxd/init.c
>>+++ b/drivers/dma/idxd/init.c
>>@@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>> device_unregister(idxd_confdev(idxd));
>> idxd_shutdown(pdev);
>> idxd_device_remove_debugfs(idxd);
>>- idxd_cleanup(idxd);
>>+ perfmon_pmu_remove(idxd);
>>+ idxd_cleanup_interrupts(idxd);
>>+ if (device_pasid_enabled(idxd))
>>+ idxd_disable_system_pasid(idxd);
>>
>This will hit memory leak issue.
>
>idxd_remove_internals() does not only put_device() but also free
>allocated memory for wqs, engines, groups. Without calling
>idxd_remove_internals(), the allocated memory is leaked.
>
>I think a right fix is to remove the put_device() in
>idxd_cleanup_wqs/engines/groups() because:
>
>1. idxd_setup_wqs/engines/groups() does not call get_device(). Their
>counterpart idxd_cleanup_wqs/engines/groups() shouldn't call
>put_device().
>
>2. Fix the issue mentioned in this patch while there is no memory leak
>issue.
>
>> pci_iounmap(pdev, idxd->reg_base);
>> put_device(idxd_confdev(idxd));
>> pci_disable_device(pdev);
>
>Thanks.
>
>-Fenghua
>
Hi Fenghua,
As with the comments on the previous patch, the function
idxd_conf_device_release already covers part of what is done in
idxd_remove_internals, including:
kfree(idxd->groups);
kfree(idxd->wqs);
kfree(idxd->engines);
kfree(idxd);
We need to redesign idxd_remove_internals to clearly identify what
truely result in memory leaks and should be handled there.
Then, we'll need another patch to fix the idxd_remove_internals and call
it here.
Let's prioritize addressing the two critical issues we've encountered here,
and then revisit the memory leak discussion afterward.
Thanks
--Sun, Yi
Powered by blists - more mailing lists