Message-ID: <871prh9952.fsf@intel.com>
Date: Tue, 17 Jun 2025 17:38:17 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Fenghua Yu <fenghuay@...dia.com>, Yi Sun <yi.sun@...el.com>,
dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: dave.jiang@...el.com, gordon.jin@...el.com
Subject: Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on module unload
Fenghua Yu <fenghuay@...dia.com> writes:
> Hi, Yi,
>
> On 6/17/25 03:27, Yi Sun wrote:
>> A recent refactor introduced a misplaced put_device() call, leading to a
>> reference count underflow during module unload.
>>
>> There is no need to add additional put_device() calls for idxd groups,
>> engines, or workqueues. Although commit a409e919ca32 claims: "Note, this
>> also fixes the missing put_device() for idxd groups, engines, and wqs."
>> It appears no such omission existed. The required cleanup is already
>> handled by the call chain:
>>
>>
>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>> idxd_cleanup_internals() which was not originally part of the driver
>> unload path and introduced unintended reference count underflow.
>>
>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>> Signed-off-by: Yi Sun <yi.sun@...el.com>
>>
>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>> index 40cc9c070081..40f4bf446763 100644
>> --- a/drivers/dma/idxd/init.c
>> +++ b/drivers/dma/idxd/init.c
>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>  	device_unregister(idxd_confdev(idxd));
>>  	idxd_shutdown(pdev);
>>  	idxd_device_remove_debugfs(idxd);
>> -	idxd_cleanup(idxd);
>> +	perfmon_pmu_remove(idxd);
>> +	idxd_cleanup_interrupts(idxd);
>> +	if (device_pasid_enabled(idxd))
>> +		idxd_disable_system_pasid(idxd);
>>
> This will cause a memory leak.
>
> idxd_remove_internals() not only calls put_device() but also frees the
> memory allocated for wqs, engines, and groups. Without calling
> idxd_remove_internals(), that memory is leaked.
>
> I think the right fix is to remove the put_device() calls in
> idxd_cleanup_wqs/engines/groups() because:
>
> 1. idxd_setup_wqs/engines/groups() does not call get_device(). Their
> counterpart idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>
> 2. It fixes the issue described in this patch without introducing a
> memory leak.
>
I think the problem is a bit different: the driver does a lot of custom
deallocation itself instead of relying on the device lifetime tracking
to free resources. That is, we should free the memory associated with a
device when its .release() is called.
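
To illustrate the idea, here is a minimal userspace sketch, not driver
code. Every name in it (mock_device, mock_wq, mock_get_device(),
mock_put_device(), and so on) is hypothetical and only loosely modeled
on the kernel's get_device()/put_device() and .release() pattern. The
point is that the memory is freed in exactly one place, the release
callback, which runs when the last reference is dropped:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release hook. */
struct mock_device {
	int refcount;
	void (*release)(struct mock_device *dev);
};

/* Hypothetical object that embeds its device and owns some memory. */
struct mock_wq {
	int *descs;
	struct mock_device conf_dev;
};

static int release_calls;

/* The only place where the wq and its resources are freed. */
static void mock_wq_release(struct mock_device *dev)
{
	struct mock_wq *wq = (struct mock_wq *)
		((char *)dev - offsetof(struct mock_wq, conf_dev));

	free(wq->descs);
	free(wq);
	release_calls++;
}

static void mock_get_device(struct mock_device *dev)
{
	dev->refcount++;
}

static void mock_put_device(struct mock_device *dev)
{
	/* An unbalanced put would trip this, mirroring a refcount underflow. */
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Allocation hands back an object holding one initial reference. */
static struct mock_wq *mock_wq_alloc(void)
{
	struct mock_wq *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return NULL;
	wq->descs = calloc(16, sizeof(*wq->descs));
	if (!wq->descs) {
		free(wq);
		return NULL;
	}
	wq->conf_dev.refcount = 1;
	wq->conf_dev.release = mock_wq_release;
	return wq;
}

/* Returns how many times release ran, or -1 on failure. */
static int demo_release_calls(void)
{
	struct mock_wq *wq = mock_wq_alloc();
	int calls_while_owned;

	release_calls = 0;
	if (!wq)
		return -1;
	mock_get_device(&wq->conf_dev);	/* a user takes a reference */
	mock_put_device(&wq->conf_dev);	/* and drops it; object stays alive */
	calls_while_owned = release_calls;
	mock_put_device(&wq->conf_dev);	/* owner drops the initial reference */
	return calls_while_owned == 0 ? release_calls : -1;
}
```

With this shape, teardown code only needs to balance its own
references; no separate cleanup helper has to free the memory, so there
is neither a leak nor a double put.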
>>  	pci_iounmap(pdev, idxd->reg_base);
>>  	put_device(idxd_confdev(idxd));
>>  	pci_disable_device(pdev);
>
> Thanks.
>
> -Fenghua
>
Cheers,
--
Vinicius