Message-ID: <20260210132837.26f7e7bc@fedora>
Date: Tue, 10 Feb 2026 13:28:37 +0100
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: "Christian König" <christian.koenig@....com>, Danilo
Krummrich <dakr@...nel.org>, Philipp Stanner <phasta@...lbox.org>,
phasta@...nel.org, David Airlie <airlied@...il.com>, Simona Vetter
<simona@...ll.ch>, Gary Guo <gary@...yguo.net>, Benno Lossin
<lossin@...nel.org>, Daniel Almeida <daniel.almeida@...labora.com>, Joel
Fernandes <joelagnelf@...dia.com>, linux-kernel@...r.kernel.org,
dri-devel@...ts.freedesktop.org, rust-for-linux@...r.kernel.org
Subject: Re: [RFC PATCH 2/4] rust: sync: Add dma_fence abstractions
On Tue, 10 Feb 2026 11:40:14 +0000
Alice Ryhl <aliceryhl@...gle.com> wrote:
> On Tue, Feb 10, 2026 at 11:46:44AM +0100, Christian König wrote:
> > On 2/10/26 11:36, Danilo Krummrich wrote:
> > > On Tue Feb 10, 2026 at 11:15 AM CET, Alice Ryhl wrote:
> > >> One way you can see this is by looking at what we require of the
> > >> workqueue. For all this to work, it's pretty important that we never
> > >> schedule anything on the workqueue that's not signalling safe, since
> > >> otherwise you could have a deadlock where the workqueue is executing some
> > >> random job calling kmalloc(GFP_KERNEL) and then blocks on our fence,
> > >> meaning that the VM_BIND job never gets scheduled since the workqueue
> > >> is never freed up. Deadlock.
> > >
> > > Yes, I also pointed this out multiple times in the past in the context of C GPU
> > > scheduler discussions. It really depends on the workqueue and how it is used.
> > >
> > > In the C GPU scheduler the driver can pass its own workqueue to the scheduler,
> > > which means the driver has to ensure that at least one of the
> > > wq->max_active work slots is free for the scheduler to make progress
> > > on its run-job and free-job work.
> > >
> > > Or in other words, no more than wq->max_active - 1 work items may
> > > execute code that violates the DMA fence signalling rules.
>
> Ouch, is that really the best way to do that? Why not two workqueues?
Honestly, I'm wondering whether we wouldn't be better off adding the
concept of a DmaFenceSignalingWorkqueue on which only
DmaFenceSignalingWorkItems can be scheduled, for our own sanity.
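
Roughly what I have in mind, just as a sketch (all names and signatures
below are made up for illustration and are not the existing
kernel::workqueue bindings):

// Hypothetical sketch only: names and signatures are invented to
// illustrate the idea, not tied to the current kernel crate.

/// Marker trait: an implementor promises that running it is DMA fence
/// signalling safe (no GFP_KERNEL allocations, no waiting on
/// unsignalled fences, and so on).
///
/// `unsafe` because the compiler cannot verify that promise.
pub unsafe trait DmaFenceSignalingSafe {}

/// A work item that may be queued on a signalling workqueue.
pub trait DmaFenceSignalingWorkItem: DmaFenceSignalingSafe + Send + 'static {
    fn run(self);
}

/// A workqueue whose workers only ever execute signalling-safe items,
/// so a fence signalled from one of its items can never end up stuck
/// behind a work item that itself blocks on that fence.
pub struct DmaFenceSignalingWorkqueue {
    // ...wrapper around an underlying kernel workqueue, elided...
}

impl DmaFenceSignalingWorkqueue {
    /// Only signalling-safe items are accepted; anything else is
    /// rejected at compile time by the trait bound.
    pub fn enqueue<W: DmaFenceSignalingWorkItem>(&self, item: W) {
        // ...hand `item` off to the underlying workqueue, elided...
        let _ = item;
    }
}

The point being that "only signalling-safe work goes on this queue"
would become a type-level invariant instead of a per-driver convention.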