Message-ID: <682a788c21c6513e30c5eb1020191f31f7aec612.camel@mailbox.org>
Date: Tue, 25 Nov 2025 13:20:03 +0100
From: Philipp Stanner <phasta@...lbox.org>
To: Boris Brezillon <boris.brezillon@...labora.com>
Cc: phasta@...nel.org, Daniel Almeida <daniel.almeida@...labora.com>, Alice
Ryhl <aliceryhl@...gle.com>, Danilo Krummrich <dakr@...nel.org>, Christian
König <ckoenig.leichtzumerken@...il.com>, Tvrtko Ursulin
<tursulin@...ulin.net>, Alexandre Courbot <acourbot@...dia.com>, Dave
Airlie <airlied@...hat.com>, Lyude Paul <lyude@...hat.com>, Peter Colberg
<pcolberg@...hat.com>, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: [RFC WIP 2/3] rust: sync: Add dma_fence abstractions

On Tue, 2025-11-25 at 11:58 +0100, Boris Brezillon wrote:
> On Tue, 25 Nov 2025 10:48:12 +0100
> Philipp Stanner <phasta@...lbox.org> wrote:
>
> > > > +impl ArcBorrow<'_, DmaFenceCtx> {
> > > > + /// Create a new fence, consuming `data`.
> > > > + ///
> > > > + /// The fence will increment the refcount of the fence context associated with this
> > > > + /// [`DmaFenceCtx`].
> > > > + pub fn new_fence<T>(
> > > > + &mut self,
> > > > + data: impl PinInit<T>,
> > > > + ) -> Result<ARef<DmaFence<T>>> {
> > > > + let fctx: Arc<DmaFenceCtx> = (*self).into();
> > > > + let seqno: u64 = fctx.get_new_fence_seqno();
> > > > +
> > > > + // TODO: Should we reset seqno in case of failure?
> > >
> > > I think we should go back to the old value, yeah.
> >
> > It would be trivial to implement that (just atomic.decrement()).
> >
> > The reason the TODO even exists is that I'm a bit unsure about
> > races. It seems we have to choose between either a gap in the seqnos or
> > the possibility of seqnos being out of order.
> >
> > If the user / driver creates fences with >1 thread on a fence context,
> > I mean.
> >
> > We're pretty free in our choices, however. The shared fence-fctx
> > spinlock will be removed anyway, so one could later easily replace the
> > fctx atomic with a lock if that's desirable.
> >
> > I can implement a seqno-decrement for now.
>
> I don't think we need to return unused seqnos in case of failure. I
> mean, we could have something like the following pseudo-code:
>
> atomic_cmpxchg(ctx.seqno, fence.seqno + 1, fence.seqno)
The code above is what creates fence.seqno in the first place, by
obtaining the next seqno from the fctx (timeline).
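Something like this, I suppose (rough sketch with invented helper
names, atomics written in plain Rust style just for illustration, not
the actual patch code):

fn new_fence<T>(
    &mut self,
    data: impl PinInit<T>,
) -> Result<ARef<DmaFence<T>>> {
    // Reserve the next seqno on this timeline.
    let seqno = self.seqno.fetch_add(1, Ordering::Relaxed);

    match self.alloc_and_init_fence(data, seqno) { // hypothetical helper
        Ok(fence) => Ok(fence),
        Err(e) => {
            // Best-effort rollback, matching the cmpxchg above: it only
            // succeeds if no other thread has grabbed a newer seqno in
            // the meantime; otherwise we just accept the gap.
            let _ = self.seqno.compare_exchange(
                seqno + 1, seqno, Ordering::Relaxed, Ordering::Relaxed);
            Err(e)
        }
    }
}
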
>
> but it wouldn't cover the case where fences are not returned in the
> order they were assigned, and seqnos are pretty cheap anyway (if a u64
> is enough to count things in nanoseconds for hundreds of years, they are
> more than enough for a fence timeline on which fences are emitted at a
> way lower rate, even in case of recurring failures). The guarantee we
> really care about is seqnos not going backward, because that would mess
> with the assumption that fences on a given timeline/ctx are signalled
> in order (this assumption is used to coalesce fences in a
> fence_array/resv IIRC).

Agreed, everything should signal in seqno order.

The question we are facing with Rust (or rather, with my design here)
is to what degree the infrastructure should enforce this. As you
know, in C there isn't even a real "fence context", it's just an
abstract concept, represented by an integer maintained by the driver.

In Rust we can model it more exactly. For instance, we could enforce
ordered signaling by creating a function as the only way to signal
fences:

    fctx.signal_all_fences_up_to_seqno(9042);

which then iterates over the fences, signaling all in order.
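The idea would be roughly this (pseudo-Rust, container and names
invented; assumes the fctx keeps its unsignaled fences sorted by
ascending seqno):

impl DmaFenceCtx {
    /// Signal every pending fence with fence.seqno <= seqno, in order.
    pub fn signal_all_fences_up_to_seqno(&self, seqno: u64) {
        // Hypothetical list of this context's unsignaled fences.
        let mut pending = self.pending.lock();
        while pending.front().is_some_and(|f| f.seqno() <= seqno) {
            let fence = pending.pop_front().unwrap();
            fence.signal();
        }
    }
}

Since that would be the only signaling path, signaling out of order
becomes impossible by construction.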

With other APIs, such as jobqueue.submit_job() (which creates a fence
with the code above), ordering is trivial as long as the driver only
calls submit_job() from a single thread.

If the driver came up with the idea of firing submit_job() from two or
more threads, then you have ordering issues by design, regardless of
whether the fence context uses this or that atomic operation or even
full locking.

If the driver interleaves calls to submit_job() or new_fence(), then it
can certainly happen that done_fences get signaled by JQ out of order.
I'm not sure how horrible that actually would be, though, considering
that it does not imply that anything gets executed before all of its
dependencies are fulfilled. AFAICS it's more "nice to have / would be
the cleanest possible design".

I think I have a TODO in jobqueue where we could solve that by only
signaling pending done_fences once all preceding fences have been
signaled.
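
That could look roughly like this (sketch, all names invented):

// Mark the completed job, then flush done_fences strictly from the
// front of the per-context queue; a job that completed out of order
// simply waits until its predecessors have completed as well.
fn on_job_complete(&self, id: u64) {
    let mut jobs = self.pending_jobs.lock();
    if let Some(job) = jobs.iter_mut().find(|j| j.id == id) {
        job.completed = true;
    }
    while jobs.front().is_some_and(|j| j.completed) {
        let job = jobs.pop_front().unwrap();
        job.done_fence.signal();
    }
}
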
P.