Message-Id: <DG8AP9IH4BW7.ZIQWZKXNNH0U@kernel.org>
Date: Sat, 07 Feb 2026 01:15:55 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: <gregkh@...uxfoundation.org>, <rafael@...nel.org>, <ojeda@...nel.org>,
<boqun.feng@...il.com>, <gary@...yguo.net>, <bjorn3_gh@...tonmail.com>,
<lossin@...nel.org>, <a.hindborg@...nel.org>, <aliceryhl@...gle.com>,
<tmgross@...ch.edu>
Cc: <driver-core@...ts.linux.dev>, <rust-for-linux@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, "Boris Brezillon"
<boris.brezillon@...labora.com>, "Markus Probst" <markus.probst@...teo.de>
Subject: Re: [PATCH] rust: devres: fix race condition due to nesting
On Thu Feb 5, 2026 at 11:25 PM CET, Danilo Krummrich wrote:
> Commit f5d3ef25d238 ("rust: devres: get rid of Devres' inner Arc")
> attempted to optimize away the internal reference count of Devres.
>
> However, without an internal reference count, we can't support cases
> where Devres is indirectly nested; such nesting results in a deadlock.
>
> Such indirect nesting easily happens in the following way:
>
> A registration object (which is guarded by devres) holds a reference
> to an object that holds a device resource which is itself guarded by
> devres.
>
> For instance, a drm::Registration holds a reference to a drm::Device.
> The drm::Device itself holds a device resource in its private data.
>
> When the drm::Registration is dropped by devres, and it happens to
> hold the last reference to the drm::Device, it also drops the device
> resource, which is itself guarded by devres.
>
> This results in a deadlock in the Devres destructor of the device
> resource, as shown in the following backtrace.
>
> sysrq: Show Blocked State
> task:rmmod state:D stack:0 pid:1331 tgid:1331 ppid:1330 task_flags:0x400100 flags:0x00000010
> Call trace:
> __switch_to+0x190/0x294 (T)
> __schedule+0x878/0xf10
> schedule+0x4c/0xcc
> schedule_timeout+0x44/0x118
> wait_for_common+0xc0/0x18c
> wait_for_completion+0x18/0x24
> _RINvNtCs4gKlGRWyJ5S_4core3ptr13drop_in_placeINtNtNtCsgzhNYVB7wSz_6kernel4sync3arc3ArcINtNtBN_6devres6DevresmEEECsRdyc7Hyps3_15rust_driver_pci+0x68/0xe8 [rust_driver_pci]
> _RINvNvNtCsgzhNYVB7wSz_6kernel6devres16register_foreign8callbackINtNtCs4gKlGRWyJ5S_4core3pin3PinINtNtNtB6_5alloc4kbox3BoxINtNtNtB6_4sync3arc3ArcINtB4_6DevresmEENtNtB1A_9allocator7KmallocEEECsRdyc7Hyps3_15rust_driver_pci+0x34/0xc8 [rust_driver_pci]
> devm_action_release+0x14/0x20
> devres_release_all+0xb8/0x118
> device_release_driver_internal+0x1c4/0x28c
> driver_detach+0x94/0xd4
> bus_remove_driver+0xdc/0x11c
> driver_unregister+0x34/0x58
> pci_unregister_driver+0x20/0x80
> __arm64_sys_delete_module+0x1d8/0x254
> invoke_syscall+0x40/0xcc
> el0_svc_common+0x8c/0xd8
> do_el0_svc+0x1c/0x28
> el0_svc+0x54/0x1d4
> el0t_64_sync_handler+0x84/0x12c
> el0t_64_sync+0x198/0x19c
>
> In order to fix this, re-introduce the internal reference count.
>
> Reported-by: Boris Brezillon <boris.brezillon@...labora.com>
> Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/.E2.9C.94.20Deadlock.20caused.20by.20nested.20Devres/with/571242651
> Reported-by: Markus Probst <markus.probst@...teo.de>
> Closes: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/.E2.9C.94.20Devres.20inside.20Devres.20stuck.20on.20cleanup/with/571239721
> Reported-by: Alice Ryhl <aliceryhl@...gle.com>
> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/56#note_3282757
> Fixes: f5d3ef25d238 ("rust: devres: get rid of Devres' inner Arc")
> Signed-off-by: Danilo Krummrich <dakr@...nel.org>
Applied to driver-core-testing, thanks!
[ Call clone() prior to devm_add_action(). - Danilo ]
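
For readers following along, here is a minimal user-space sketch of the
nesting described in the commit message and of why an internal reference
count avoids the wait. It is not the kernel implementation: Device,
Devres, Inner, DrmDevice, Registration, Resource and demo() are
hypothetical stand-ins built on std::sync::Arc, and the action list only
approximates devm_add_action() / devres_release_all(). The assumption,
based on the backtrace, is that without the reference count the
destructor has to wait for its own release callback, which cannot run
while an earlier callback is still executing on the same task.

```rust
use std::sync::{Arc, Mutex};

type Log = Arc<Mutex<Vec<&'static str>>>;

/// State shared between a `Devres` handle and its release callback.
struct Inner<T> {
    data: Mutex<Option<T>>,
}

impl<T> Inner<T> {
    /// Drop the payload if it is still present; callable from both sides.
    fn revoke(&self) {
        // Take the payload out first so the lock is released before the
        // payload's destructor runs (it may drop further `Devres` handles).
        let data = self.data.lock().unwrap().take();
        drop(data);
    }
}

/// Hypothetical stand-in for a device and its devres action list.
#[derive(Default)]
struct Device {
    actions: Mutex<Vec<Box<dyn FnOnce()>>>,
}

impl Device {
    /// Run all actions in reverse registration order, as on driver unbind.
    fn release_all(&self) {
        // Detach the list first; the device lock is not held while
        // the actions run.
        let mut actions = std::mem::take(&mut *self.actions.lock().unwrap());
        while let Some(action) = actions.pop() {
            action();
        }
    }
}

/// Hypothetical stand-in for `Devres<T>` with an internal reference count.
struct Devres<T> {
    inner: Arc<Inner<T>>,
}

impl<T: 'static> Devres<T> {
    fn new(dev: &Device, data: T) -> Self {
        let inner = Arc::new(Inner { data: Mutex::new(Some(data)) });
        // Clone before registering: the callback owns its own reference.
        let for_callback = Arc::clone(&inner);
        dev.actions
            .lock()
            .unwrap()
            .push(Box::new(move || for_callback.revoke()));
        Devres { inner }
    }
}

impl<T> Drop for Devres<T> {
    fn drop(&mut self) {
        // No waiting for the callback: `Inner` stays alive until both
        // references are gone, so revoking here is sufficient.
        self.inner.revoke();
    }
}

struct Resource {
    log: Log,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("resource released");
    }
}

/// Plays the role of drm::Device: holds a devres-guarded resource.
struct DrmDevice {
    _res: Devres<Resource>,
}

/// Plays the role of drm::Registration: holds a reference to the device.
struct Registration {
    _drm: Arc<DrmDevice>,
    log: Log,
}

impl Drop for Registration {
    fn drop(&mut self) {
        self.log.lock().unwrap().push("registration dropped");
    }
}

fn demo() -> Vec<&'static str> {
    let log: Log = Arc::default();
    let dev = Device::default();
    // Registered first, so its callback runs last.
    let res = Devres::new(&dev, Resource { log: log.clone() });
    let drm = Arc::new(DrmDevice { _res: res });
    // Registered last, so its callback runs first and drops `drm`.
    let reg = Devres::new(&dev, Registration { _drm: drm, log: log.clone() });
    dev.release_all();
    drop(reg);
    let out = log.lock().unwrap().clone();
    out
}
```

Calling demo() from a main() function runs the nested teardown to
completion: the Registration callback drops the last DrmDevice reference,
the nested Devres revokes its Resource immediately, and the Resource
callback later finds nothing left to do.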