Message-ID: <13cc0bf5-ecfc-4cda-ac8b-dacd714d5c41@roeck-us.net>
Date: Sat, 5 Jul 2025 12:03:06 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Shuai Xue <xueshuai@...ux.alibaba.com>
Cc: vinicius.gomes@...el.com, dave.jiang@...el.com, fenghuay@...dia.com,
vkoul@...nel.org, dmaengine@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/9] dmaengine: idxd: fix memory leak in error
handling path
Hi,
On Fri, Apr 04, 2025 at 08:02:08PM +0800, Shuai Xue wrote:
> changes since v3:
> - remove a blank line to fix checkpatch warning per Fenghua
> - collect Reviewed-by tags from Fenghua
>
>
> changes since v2:
> - add to cc stable per Markus
> - add patch 4 to fix memory leak in idxd_setup_internals per Fenghua
> - collect Reviewed-by tag for patch 2 from Fenghua
> - fix reference cnt in remove() per Fenghua
>
> changes since v1:
> - add Reviewed-by tag for patch 1-5 from Dave Jiang
> - add fixes tag
> - add patch 6 and 7 to fix memory leak in remove call per Vinicius
>
> Shuai Xue (9):
> dmaengine: idxd: fix memory leak in error handling path of
> idxd_setup_wqs
> dmaengine: idxd: fix memory leak in error handling path of
> idxd_setup_engines
> dmaengine: idxd: fix memory leak in error handling path of
> idxd_setup_groups
> dmaengine: idxd: Add missing cleanup for early error out in
> idxd_setup_internals
> dmaengine: idxd: Add missing cleanups in cleanup internals
> dmaengine: idxd: fix memory leak in error handling path of idxd_alloc
> dmaengine: idxd: fix memory leak in error handling path of
> idxd_pci_probe
> dmaengine: idxd: Add missing idxd cleanup to fix memory leak in remove
> call
> dmaengine: idxd: Refactor remove call with idxd_cleanup() helper
>
> drivers/dma/idxd/init.c | 159 ++++++++++++++++++++++++++++------------
> 1 file changed, 113 insertions(+), 46 deletions(-)
>
This patch series, as applied to 6.6 and 6.12 kernels, results in a variety
of warning backtraces and crashes when unloading the idxd driver, such as
ida_free called for id=0 which is not allocated.
refcount_t: underflow; use-after-free.
list_add corruption. next->prev should be prev (ff11d2ed9908ecd0), but was ff11d2ed8a5d0ba0. (next=ff11d2ed8a5d0ba0).
Looking into it, I see that many resources are now released from functions
such as idxd_cleanup() and idxd_free(). At the same time, the calls to
put_device(idxd_confdev(idxd)) trigger calls to idxd_conf_device_release()
which releases the same resources. On top of that,
put_device(idxd_confdev(idxd)) is now called from idxd_remove() _and_ from
idxd_free() [which is called from idxd_remove()], in addition to the
put_device() called from device_unregister() itself.
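To make the concern concrete, here is a minimal userspace model of the
pattern described above. It is only a sketch, not the idxd code: struct
fake_dev and all fake_* / model_* names are hypothetical, and it assumes
that a single reference is left on the conf device when remove starts,
which may not match the real driver.

```c
/*
 * Userspace model of the double-release / extra-put pattern.
 * NOT the idxd code; every name here is hypothetical.
 */
struct fake_dev {
	int refcount;	/* models the struct device reference count */
	int wq_freed;	/* how often the modeled resource was released */
	int underflows;	/* puts attempted after the count reached zero */
};

/* Models a release callback such as idxd_conf_device_release(). */
static void fake_release(struct fake_dev *d)
{
	d->wq_freed++;
}

/* Models put_device(): the last put runs the release callback. */
static void fake_put(struct fake_dev *d)
{
	if (d->refcount <= 0) {
		/* The kernel would report a refcount_t underflow here. */
		d->underflows++;
		return;
	}
	if (--d->refcount == 0)
		fake_release(d);
}

/* Models an explicit cleanup helper freeing what release also frees. */
static void fake_cleanup(struct fake_dev *d)
{
	d->wq_freed++;
}

/* Models the remove path as described in this mail. */
struct fake_dev model_remove(void)
{
	/* Assumption: one reference is held when remove starts. */
	struct fake_dev d = { .refcount = 1 };

	fake_put(&d);		/* put done by device_unregister() */
	fake_cleanup(&d);	/* explicit release of the same resource */
	fake_put(&d);		/* extra put in the remove function */
	fake_put(&d);		/* put in the free helper called from remove */
	return d;
}
```

Under that assumption, model_remove() records the resource as released
twice and two puts issued after the count already reached zero, which is
the kind of imbalance the warnings above would point to.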
Does this actually work in the upstream kernel? What prevents duplicate
release of resources from idxd_free(), idxd_cleanup(), and
idxd_conf_device_release()? And why would idxd_remove() need the extra
put_device() call?
Sorry if I am missing something; I am just trying to understand the logic
behind this patch series and why it triggers crashes and warning backtraces
in 6.6 and 6.12 kernels.
Thanks,
Guenter