Message-Id: <DGB7RWKMPJQZ.2PHB127O6MVVN@kernel.org>
Date: Tue, 10 Feb 2026 11:36:19 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Alice Ryhl" <aliceryhl@...gle.com>
Cc: "Boris Brezillon" <boris.brezillon@...labora.com>,
Christian König <christian.koenig@....com>, "Philipp
Stanner" <phasta@...lbox.org>, <phasta@...nel.org>, "David Airlie"
<airlied@...il.com>, "Simona Vetter" <simona@...ll.ch>, "Gary Guo"
<gary@...yguo.net>, "Benno Lossin" <lossin@...nel.org>, "Daniel Almeida"
<daniel.almeida@...labora.com>, "Joel Fernandes" <joelagnelf@...dia.com>,
<linux-kernel@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
<rust-for-linux@...r.kernel.org>
Subject: Re: [RFC PATCH 2/4] rust: sync: Add dma_fence abstractions
On Tue Feb 10, 2026 at 11:15 AM CET, Alice Ryhl wrote:
> One way you can see this is by looking at what we require of the
> workqueue. For all this to work, it's pretty important that we never
> schedule anything on the workqueue that's not signalling safe, since
> otherwise you could have a deadlock where the workqueue executes some
> random job calling kmalloc(GFP_KERNEL) and then blocks on our fence,
> meaning that the VM_BIND job never gets scheduled since the workqueue
> is never freed up. Deadlock.
Yes, I also pointed this out multiple times in the past in the context of C GPU
scheduler discussions. It really depends on the workqueue and how it is used.
In the C GPU scheduler the driver can pass its own workqueue to the
scheduler, which means that the driver has to ensure that at least one of
the wq->max_active work items remains free for the scheduler to make
progress on its run-job and free-job work.
In other words, there must be no more than wq->max_active - 1 work items
executing code that violates the DMA fence signalling rules.
This is also why the JobQ needs its own workqueue; relying on the system
WQ is unsound.
In the case of an ordered workqueue, it is of course always a potential
deadlock to schedule work that performs non-atomic allocations, or that
takes a lock which is held elsewhere around non-atomic allocations.