Message-Id: <DGB7RWKMPJQZ.2PHB127O6MVVN@kernel.org>
Date: Tue, 10 Feb 2026 11:36:19 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Alice Ryhl" <aliceryhl@...gle.com>
Cc: "Boris Brezillon" <boris.brezillon@...labora.com>,
 Christian König <christian.koenig@....com>, "Philipp
 Stanner" <phasta@...lbox.org>, <phasta@...nel.org>, "David Airlie"
 <airlied@...il.com>, "Simona Vetter" <simona@...ll.ch>, "Gary Guo"
 <gary@...yguo.net>, "Benno Lossin" <lossin@...nel.org>, "Daniel Almeida"
 <daniel.almeida@...labora.com>, "Joel Fernandes" <joelagnelf@...dia.com>,
 <linux-kernel@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
 <rust-for-linux@...r.kernel.org>
Subject: Re: [RFC PATCH 2/4] rust: sync: Add dma_fence abstractions

On Tue Feb 10, 2026 at 11:15 AM CET, Alice Ryhl wrote:
> One way you can see this is by looking at what we require of the
> workqueue. For all this to work, it's pretty important that we never
> schedule anything on the workqueue that's not signalling safe, since
> otherwise you could have a deadlock where the workqueue executes some
> random job calling kmalloc(GFP_KERNEL) and then blocks on our fence,
> meaning that the VM_BIND job never gets scheduled since the workqueue
> is never freed up. Deadlock.

Yes, I also pointed this out multiple times in the past in the context of C GPU
scheduler discussions. It really depends on the workqueue and how it is used.

In the C GPU scheduler the driver can pass its own workqueue to the scheduler,
which means that the driver has to ensure that at least one out of the
wq->max_active works is free for the scheduler to make progress on the
scheduler's run and free job work.

Or in other words, there must be no more than wq->max_active - 1 works that
execute code violating the DMA fence signalling rules.
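The max_active - 1 invariant can be illustrated with a small userspace sketch
(this is not kernel code; a ThreadPoolExecutor with two workers stands in for a
workqueue with wq->max_active == 2, and a threading.Event stands in for the DMA
fence):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical userspace model: a workqueue with max_active == 2.
# One queued work blocks on a fence; as long as at most
# max_active - 1 (= 1) works block like this, one worker remains
# free to run the scheduler's work, which signals the fence.

fence = threading.Event()

def blocking_work():
    # Violates the fence signalling rules: blocks until the
    # fence is signalled (like a GFP_KERNEL allocation stalling
    # on reclaim that waits for this fence).
    return fence.wait(timeout=5.0)

def scheduler_work():
    # Stands in for the scheduler's run/free job work, which
    # eventually signals the fence.
    fence.set()

wq = ThreadPoolExecutor(max_workers=2)   # wq->max_active == 2
blocked = wq.submit(blocking_work)       # occupies one worker
wq.submit(scheduler_work)                # runs on the free worker

print("progress made:", blocked.result())  # True: fence was signalled
wq.shutdown()
```

With a second fence-blocking work submitted first, both workers would be
occupied and the scheduler work could never run; that is exactly the invariant
being violated.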

This is also why the JobQ needs its own workqueue and why relying on the system
WQ is unsound.

In the case of an ordered workqueue, of course, it is always a potential
deadlock to schedule work that does non-atomic allocations, or that takes a
lock which is held elsewhere around non-atomic allocations.
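The ordered case is the degenerate form of the same sketch (again a userspace
model, not kernel code): with a single worker, any work that waits on the fence
starves the work queued behind it that would signal that fence. The timeout
below stands in for what would be an indefinite hang in the kernel:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical model of an ordered workqueue: max_active == 1.
# job_a blocks waiting for the fence; job_b, which would signal
# the fence, is queued behind it on the only worker and can
# never run while job_a waits.

fence = threading.Event()

def job_a():
    # Blocks on the fence; in the kernel this wait would never
    # return. Here a timeout makes the deadlock observable.
    return fence.wait(timeout=1.0)

def job_b():
    # Would signal the fence, but is stuck behind job_a.
    fence.set()

wq = ThreadPoolExecutor(max_workers=1)   # ordered: max_active == 1
fa = wq.submit(job_a)
wq.submit(job_b)

print("fence signalled before timeout:", fa.result())  # False
wq.shutdown()
```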
