[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f4c32d9a-7303-4a50-aafc-8039102fbd3e@amd.com>
Date: Tue, 10 Feb 2026 14:56:52 +0100
From: Christian König <christian.koenig@....com>
To: Alice Ryhl <aliceryhl@...gle.com>,
Boris Brezillon <boris.brezillon@...labora.com>
Cc: Philipp Stanner <phasta@...lbox.org>, phasta@...nel.org,
Danilo Krummrich <dakr@...nel.org>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Gary Guo <gary@...yguo.net>,
Benno Lossin <lossin@...nel.org>,
Daniel Almeida <daniel.almeida@...labora.com>,
Joel Fernandes <joelagnelf@...dia.com>, linux-kernel@...r.kernel.org,
dri-devel@...ts.freedesktop.org, rust-for-linux@...r.kernel.org
Subject: Re: [RFC PATCH 2/4] rust: sync: Add dma_fence abstractions
On 2/10/26 14:49, Alice Ryhl wrote:
> On Tue, Feb 10, 2026 at 02:26:31PM +0100, Boris Brezillon wrote:
>> On Tue, 10 Feb 2026 13:15:31 +0000
>> Alice Ryhl <aliceryhl@...gle.com> wrote:
>>
>>> On Tue, Feb 10, 2026 at 01:36:17PM +0100, Boris Brezillon wrote:
>>>> On Tue, 10 Feb 2026 10:15:04 +0000
>>>> Alice Ryhl <aliceryhl@...gle.com> wrote:
>>>>
>>>>> impl MustBeSignalled<'_> {
>>>>> /// Drivers generally should not use this one.
>>>>> fn i_promise_it_will_be_signalled(self) -> WillBeSignalled { ... }
>>>>>
>>>>> /// One way to ensure the fence has been signalled is to signal it.
>>>>> fn signal_fence(self) -> WillBeSignalled {
>>>>> self.fence.signal();
>>>>> self.i_promise_it_will_be_signalled()
>>>>> }
>>>>>
>>>>> /// Another way to ensure the fence will be signalled is to spawn a
>>>>> /// workqueue item that promises to signal it.
>>>>> fn transfer_to_wq(
>>>>> self,
>>>>> wq: &Workqueue,
>>>>> item: impl DmaFenceWorkItem,
>>>>> ) -> WillBeSignalled {
>>>>> // briefly obtain the lock class of the wq to indicate to
>>>>> // lockdep that the signalling path "blocks" on arbitrary jobs
>>>>> // from this wq completing
>>>>> bindings::lock_acquire(&wq->key);
>>>>> bindings::lock_release(&wq->key);
>>>>>
>>>>> // enqueue the job
>>>>> wq.enqueue(item, wq);
>>>>>
>>>>> // The signature of DmaFenceWorkItem::run() promises to arrange
>>>>> // for it to be signalled.
>>>>> self.i_promise_it_will_be_signalled()
>>>>> }
>>>>
>>>> I guess what's still missing is some sort of `transfer_to_hw()`
>>>> function and way to flag the IRQ handler taking over the fence
>>>> signaling token.
>>>
>>> Yes, transfer to hardware needs to be another piece of logic similar to
>>> transfer to wq. And I imagine there are many ways such a transfer to
>>> hardware could work.
>>>
>>> Unless you have a timeout on it, in which case the WillBeSignalled is
>>> satisfied by the fact you have a timeout alone, and the signalling that
>>> happens from the irq is just an opportunistic signal from outside the
>>> dma fence signalling critical path.
>>
>> Yes and no. If it deadlocks in the completion WorkItem because of
>> allocations (or any of the forbidden use cases), I think we want to
>> catch that, because that's a sign fences are likely to end up with
>> timeouts when they should have otherwise been signaled properly.
>>
>>> Well ... unless triggering timeouts can block on GFP_KERNEL
>>> allocations...
>>
>> I mean, the timeout handler should also be considered a DMA-signalling
>> path, and the same rules should apply to it.
>
> I guess that's fair. Even with a timeout you want both to be signalling
> path.
>
> I guess more generally, if a fence is signalled by mechanism A or B,
> whichever happens first, you have the choice between:
That doesn't happen in practice.
For each fence you only have one signaling path you need to guarantee forward progress for.
All other signaling paths are just opportunistically optimizations which *can* signal the fence, but there is no guarantee that they will.
We used to have some exceptions to that, especially around aborting submissions, but those turned out to be a really bad idea as well.
Thinking more about it you should probably enforce that there is only one signaling path for each fence signaling.
Regards,
Christian.
>
> 1. A in signalling path, B is not
> 2. B in signalling path, A is not
> 3. A and B both in signalling path
>
> But the downside of choosing (1.) or (2.) is that if you declare that
> event B is not in the signalling path, then B can kmalloc(GFP_KERNEL),
> which may deadlock on itself until event A happens, and if A is a
> timeout that could be a long time, so this scenario is undesirable even
> if technically it's not a deadlock because it eventually unblocks
> itself.
>
> So we should choose option (3.) and declare that both timeout and hw irq
> codepaths are signalling paths.
>
> Alice
Powered by blists - more mailing lists