Message-ID: <20251125145040.350e54a9@fedora>
Date: Tue, 25 Nov 2025 14:50:40 +0100
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Philipp Stanner <phasta@...lbox.org>
Cc: phasta@...nel.org, Daniel Almeida <daniel.almeida@...labora.com>, Alice
Ryhl <aliceryhl@...gle.com>, Danilo Krummrich <dakr@...nel.org>, Christian
König <ckoenig.leichtzumerken@...il.com>, Tvrtko Ursulin
<tursulin@...ulin.net>, Alexandre Courbot <acourbot@...dia.com>, Dave
Airlie <airlied@...hat.com>, Lyude Paul <lyude@...hat.com>, Peter Colberg
<pcolberg@...hat.com>, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: [RFC WIP 2/3] rust: sync: Add dma_fence abstractions
On Tue, 25 Nov 2025 13:20:03 +0100
Philipp Stanner <phasta@...lbox.org> wrote:
> On Tue, 2025-11-25 at 11:58 +0100, Boris Brezillon wrote:
> > On Tue, 25 Nov 2025 10:48:12 +0100
> > Philipp Stanner <phasta@...lbox.org> wrote:
> >
> > > > > +impl ArcBorrow<'_, DmaFenceCtx> {
> > > > > + /// Create a new fence, consuming `data`.
> > > > > + ///
> > > > > + /// The fence will increment the refcount of the fence context associated with this
> > > > > + /// [`DmaFenceCtx`].
> > > > > + pub fn new_fence<T>(
> > > > > + &mut self,
> > > > > + data: impl PinInit<T>,
> > > > > + ) -> Result<ARef<DmaFence<T>>> {
> > > > > + let fctx: Arc<DmaFenceCtx> = (*self).into();
> > > > > + let seqno: u64 = fctx.get_new_fence_seqno();
> > > > > +
> > > > > + // TODO: Should we reset seqno in case of failure?
> > > >
> > > > I think we should go back to the old value, yeah.
> > >
> > > It would be trivial to implement that (just atomic.decrement()).
> > >
> > > The thing why the TODO even exists is that I'm a bit unsure about
> > > races. It seems we have to choose between either a gap in the seqnos or
> > > the possibility of seqnos being out of order.
> > >
> > > If the user / driver creates fences with >1 thread on a fence context,
> > > I mean.
> > >
> > > We're pretty free in our choices, however. The shared fence-fctx
> > > spinlock will be removed anyway, so one could later easily replace the
> > > fctx atomic with a lock if that's desirable.
> > >
> > > I can implement a seqno-decrement for now.
> >
> > I don't think we need to return unused seqnos in case of failure. I
> > mean, we could have something like the following pseudo-code:
> >
> > atomic_cmpxchg(ctx.seqno, fence.seqno + 1, fence.seqno)
>
> The code above is the code that creates fence.seqno in the first place,
> obtaining the next seqno from the fctx (timeline).
Not sure I follow, but the pseudo-code I pasted is actually meant to
be the inverse of the seqno allocation: decrement only if the current
ctx seqno is exactly returned_seqno + 1.
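
Something like this, just to make it concrete (plain core atomics,
made-up names, not the actual DmaFenceCtx API):

    use core::sync::atomic::{AtomicU64, Ordering};

    /// Illustrative stand-in for the per-context seqno counter.
    struct FenceCtx {
        seqno: AtomicU64,
    }

    impl FenceCtx {
        /// Hand out the next seqno on this timeline. fetch_add()
        /// returns the previous value, so the counter ends up at
        /// `seqno + 1` after this call.
        fn get_new_fence_seqno(&self) -> u64 {
            self.seqno.fetch_add(1, Ordering::Relaxed)
        }

        /// Best-effort rollback: only take the seqno back if nobody
        /// allocated another one in the meantime, i.e. the counter is
        /// still exactly `seqno + 1`. If it isn't, we just leave a
        /// gap, which is fine.
        fn return_fence_seqno(&self, seqno: u64) -> bool {
            self.seqno
                .compare_exchange(seqno + 1, seqno,
                                  Ordering::Relaxed, Ordering::Relaxed)
                .is_ok()
        }
    }
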
>
> >
> > but it wouldn't cover the case where fences are not returned in the
> > order they were assigned, and seqnos are pretty cheap anyway (if a u64
> > is enough to count things in nanoseconds for hundreds of years, they are
> > more than enough for a fence timeline on which fences are emitted at a
> > way lower rate, even in case of recurring failures). The guarantee we
> > really care about is seqnos not going backward, because that would
> > break the assumption that fences on a given timeline/ctx are signalled
> > in order (this assumption is used to coalesce fences in a
> > fence_array/resv IIRC).
>
> Agreed, everything should signal according to the seqno.
>
> The question we are facing with Rust (or rather, my design here) is
> to what degree the infrastructure shall enforce this. As you
> know, in C there isn't even a real "fence context", it's just an
> abstract concept, represented by an integer maintained by the driver.
>
> In Rust we can model it more exactly. For instance, we could enforce
> ordered signaling by creating a function as the only way to signal
> fences:
>
> fctx.signal_all_fences_up_to_seqno(9042);
>
> which then iterates over the fences, signaling all in order.
Up to you, but then it implies keeping a list of currently active
fences attached to the context, plus some locks to protect this list.
Another option would be to track the last signalled seqno at the
context level, and complain if a fence with a lower seqno is signalled.
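
Roughly (again plain atomics and hypothetical names, the actual kernel
version would WARN rather than just return a bool):

    use core::sync::atomic::{AtomicU64, Ordering};

    struct FenceCtx {
        /// Highest seqno signalled on this timeline so far.
        last_signalled: AtomicU64,
    }

    impl FenceCtx {
        /// Record that `seqno` was signalled. Returns false if a
        /// higher seqno was already signalled, i.e. the driver
        /// signalled fences on this context out of order (a bug we'd
        /// want to report).
        fn note_signalled(&self, seqno: u64) -> bool {
            let prev = self.last_signalled.fetch_max(seqno, Ordering::Relaxed);
            prev <= seqno
        }
    }
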
>
>
> With some other APIs, such as jobqueue.submit_job(), which creates a
> fence with the code above, it's trivial as long as the driver only
> calls submit_job() with 1 thread.
As I was explaining to Daniel yesterday, you need some sort of
serialization when you submit to the same context anyway. In Tyr,
things will be serialized through the GPUVM resv, which guarantees that
no more than one thread can allocate seqnos on a given context inside
this critical section.
>
> If the driver came up with the idea of using >=2 threads firing on
> submit_job(), then you have ordering issues by design, regardless of
> whether the fence context uses this or that atomic operation or even
> full locking.
In practice you don't, because submissions to a given context are
serialized one way or another (see above). Maybe what we should assume
is that only one thread gets a mutable ref to this FenceCtx/JobQueue,
and that seqno allocation is something requiring mutability. The
locking would then be provided by the layer handing out the mutable
ref to this fence context.
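
Something along these lines (sketch only, made-up types):

    /// If seqno allocation takes `&mut self`, exclusivity is enforced
    /// by the borrow checker, and the serialization is whatever hands
    /// out the mutable reference (a lock around the context, or the
    /// GPUVM resv in Tyr's case). No atomics needed at this level.
    struct FenceCtx {
        seqno: u64,
    }

    impl FenceCtx {
        fn get_new_fence_seqno(&mut self) -> u64 {
            self.seqno += 1;
            self.seqno
        }
    }

    // Caller side, e.g.:
    //
    //     let mut fctx = fctx_lock.lock();   // or hold the GPUVM resv
    //     let seqno = fctx.get_new_fence_seqno();
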
>
> If the driver scrambles calls to submit_job() or new_fence(), then it
> can certainly happen that done_fences are signaled by JQ out of order –
> though I'm not sure how horrible that would actually be, considering
> that this does not imply that anything gets executed before all
> dependencies are fulfilled. AFAICS it's more "nice to have / would be
> the cleanest possible design".
A fence context should really be bound to a GPU queue, and jobs on a
GPU queue are expected to be executed in order. So long as the jobs are
queued in order, they should be executed and signalled in order. Of
course, this implies some locking when you prepare/queue jobs because
preparation and queuing are two distinct steps, and if you let 2
threads do that concurrently, the queuing won't be ordered. That's
already stuff drivers have to deal with today, so I'm not sure we
should treat Rust drivers differently, other than making this explicit
by requiring mutable JobQueue/FenceCtx references in situations where
we expect some higher-level locking to be in place.
>
> I think I have a TODO in jobqueue where we could solve that by only
> signaling pending done_fences once all preceding fences have been
> signaled.
I believe this is something we want to enforce, not fix. If signalling
is done out of order, that's probably a driver bug, and it should be
reported as such IMHO.