Message-ID: <871prh9952.fsf@intel.com>
Date: Tue, 17 Jun 2025 17:38:17 -0700
From: Vinicius Costa Gomes <vinicius.gomes@...el.com>
To: Fenghua Yu <fenghuay@...dia.com>, Yi Sun <yi.sun@...el.com>,
 dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: dave.jiang@...el.com, gordon.jin@...el.com
Subject: Re: [PATCH v3 2/2] dmaengine: idxd: Fix refcount underflow on
 module unload

Fenghua Yu <fenghuay@...dia.com> writes:

> Hi, Yi,
>
> On 6/17/25 03:27, Yi Sun wrote:
>> A recent refactor introduced a misplaced put_device() call, leading to a
>> reference count underflow during module unload.
>>
>> There is no need to add additional put_device() calls for idxd groups,
>> engines, or workqueues. Although commit a409e919ca3 claims "Note, this
>> also fixes the missing put_device() for idxd groups, engines, and wqs.",
>> it appears no such omission existed. The required cleanup is already
>> handled by the call chain:
>>
>>
>> Extend idxd_cleanup() to perform the necessary cleanup, and remove
>> idxd_cleanup_internals() which was not originally part of the driver
>> unload path and introduced unintended reference count underflow.
>>
>> Fixes: a409e919ca32 ("dmaengine: idxd: Refactor remove call with idxd_cleanup() helper")
>> Signed-off-by: Yi Sun <yi.sun@...el.com>
>>
>> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
>> index 40cc9c070081..40f4bf446763 100644
>> --- a/drivers/dma/idxd/init.c
>> +++ b/drivers/dma/idxd/init.c
>> @@ -1292,7 +1292,10 @@ static void idxd_remove(struct pci_dev *pdev)
>>   	device_unregister(idxd_confdev(idxd));
>>   	idxd_shutdown(pdev);
>>   	idxd_device_remove_debugfs(idxd);
>> -	idxd_cleanup(idxd);
>> +	perfmon_pmu_remove(idxd);
>> +	idxd_cleanup_interrupts(idxd);
>> +	if (device_pasid_enabled(idxd))
>> +		idxd_disable_system_pasid(idxd);
>>
> This will cause a memory leak.
>
> idxd_remove_internals() not only calls put_device() but also frees the
> memory allocated for wqs, engines, and groups. Without a call to
> idxd_remove_internals(), that memory is leaked.
>
> I think the right fix is to remove the put_device() calls in
> idxd_cleanup_wqs/engines/groups() because:
>
> 1. idxd_setup_wqs/engines/groups() do not call get_device(), so their
> counterparts idxd_cleanup_wqs/engines/groups() shouldn't call put_device().
>
> 2. It fixes the issue described in this patch without introducing a
> memory leak.
>

In my opinion, the problem is a bit different: the driver does a lot of
custom deallocation itself instead of relying on the device lifetime
tracking to release resources. That is, we should free the memory
associated with a device when its .release() is called.
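
To illustrate the idea, here is a minimal userspace sketch, not the idxd
code and not the kernel API. All fake_* names are hypothetical stand-ins
for struct device, get_device() and put_device(). The only place memory is
freed is the release callback, which runs when the last reference is
dropped, and an unbalanced put trips an assertion instead of silently
underflowing:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct fake_device {
	int refcount;
	void (*release)(struct fake_device *dev);
};

/* Hypothetical object embedding a device, loosely like an idxd wq. */
struct fake_wq {
	struct fake_device conf_dev;	/* first member, so the cast below is valid */
	int id;
};

/* Counts how many times the release callback has run. */
static int released_count;

/* The only place the object is freed: the release callback. */
static void fake_wq_release(struct fake_device *dev)
{
	struct fake_wq *wq = (struct fake_wq *)dev;

	free(wq);
	released_count++;
}

/* Allocate the object; the allocation owns the initial reference. */
static struct fake_wq *fake_wq_alloc(int id)
{
	struct fake_wq *wq = calloc(1, sizeof(*wq));

	if (!wq)
		return NULL;
	wq->id = id;
	wq->conf_dev.refcount = 1;
	wq->conf_dev.release = fake_wq_release;
	return wq;
}

static void fake_get_device(struct fake_device *dev)
{
	dev->refcount++;
}

static void fake_put_device(struct fake_device *dev)
{
	/* A put without a matching reference would underflow the count. */
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

With this shape, teardown code only drops the references it owns and never
frees the object directly, so there is a single deallocation path to audit.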

>>   	pci_iounmap(pdev, idxd->reg_base);
>>   	put_device(idxd_confdev(idxd));
>>   	pci_disable_device(pdev);
>
> Thanks.
>
> -Fenghua
>

Cheers,
-- 
Vinicius
