Message-Id: <DGB7RWKMPJQZ.2PHB127O6MVVN@kernel.org>
Date: Tue, 10 Feb 2026 11:36:19 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Alice Ryhl" <aliceryhl@...gle.com>
Cc: "Boris Brezillon" <boris.brezillon@...labora.com>,
Christian König <christian.koenig@....com>, "Philipp
Stanner" <phasta@...lbox.org>, <phasta@...nel.org>, "David Airlie"
<airlied@...il.com>, "Simona Vetter" <simona@...ll.ch>, "Gary Guo"
<gary@...yguo.net>, "Benno Lossin" <lossin@...nel.org>, "Daniel Almeida"
<daniel.almeida@...labora.com>, "Joel Fernandes" <joelagnelf@...dia.com>,
<linux-kernel@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
<rust-for-linux@...r.kernel.org>
Subject: Re: [RFC PATCH 2/4] rust: sync: Add dma_fence abstractions
On Tue Feb 10, 2026 at 11:15 AM CET, Alice Ryhl wrote:
> One way you can see this is by looking at what we require of the
> workqueue. For all this to work, it's pretty important that we never
> schedule anything on the workqueue that's not signalling safe, since
> otherwise you could have a deadlock where the workqueue executes some
> random job calling kmalloc(GFP_KERNEL) and then blocks on our fence,
> meaning that the VM_BIND job never gets scheduled since the workqueue
> is never freed up. Deadlock.
Yes, I also pointed this out multiple times in the past in the context of C GPU
scheduler discussions. It really depends on the workqueue and how it is used.
In the C GPU scheduler the driver can pass its own workqueue to the
scheduler, which means that the driver has to ensure that at least one of
the wq->max_active work items remains free for the scheduler to make
progress on its run-job and free-job work.
In other words, there must be no more than wq->max_active - 1 work items
executing code that violates the DMA fence signalling rules.
This is also why the JobQ needs its own workqueue; relying on the system
WQ is unsound.
In the case of an ordered workqueue, it is of course always a potential
deadlock to schedule work that performs non-atomic allocations, or that
takes a lock which is held elsewhere around non-atomic allocations.