Message-ID: <20250619135709.634625e0@collabora.com>
Date: Thu, 19 Jun 2025 13:57:09 +0200
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Daniel Almeida <daniel.almeida@...labora.com>
Cc: Danilo Krummrich <dakr@...nel.org>, Miguel Ojeda <ojeda@...nel.org>,
 Alex Gaynor <alex.gaynor@...il.com>, Boqun Feng <boqun.feng@...il.com>,
 Gary Guo <gary@...yguo.net>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, Benno Lossin <benno.lossin@...ton.me>, Andreas
 Hindborg <a.hindborg@...nel.org>, Alice Ryhl <aliceryhl@...gle.com>, Trevor
 Gross <tmgross@...ch.edu>, Sumit Semwal <sumit.semwal@...aro.org>,
 Christian König <christian.koenig@....com>, Alyssa
 Rosenzweig <alyssa@...enzweig.io>, Lyude Paul <lyude@...hat.com>, Maarten
 Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
 <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
 <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
 dri-devel@...ts.freedesktop.org, Asahi Lina <lina@...hilina.net>
Subject: Re: [PATCH v2 2/2] rust: drm: Add GPUVM abstraction

Hi,

On Fri, 13 Jun 2025 13:42:59 -0300
Daniel Almeida <daniel.almeida@...labora.com> wrote:

> Danilo,
> 
> 
> > <snip>
> >   
> >>>> +// SAFETY: DRM GpuVmBo objects are always reference counted and the get/put functions
> >>>> +// satisfy the requirements.
> >>>> +unsafe impl<T: DriverGpuVm> AlwaysRefCounted for GpuVmBo<T> {
> >>>> +    fn inc_ref(&self) {
> >>>> +        // SAFETY: The drm_gpuvm_bo_get function satisfies the requirements for inc_ref().
> >>>> +        unsafe { bindings::drm_gpuvm_bo_get(&self.bo as *const _ as *mut _) };
> >>>> +    }
> >>>> +
> >>>> +    unsafe fn dec_ref(mut obj: NonNull<Self>) {
> >>>> +        // SAFETY: drm_gpuvm_bo_put() requires holding the gpuva lock, which is
> >>>> +        // the dma_resv lock by default.
> >>>> +        // The drm_gpuvm_bo_put function satisfies the requirements for dec_ref().
> >>>> +        // (We do not support custom locks yet.)
> >>>> +        unsafe {
> >>>> +            let resv = (*obj.as_mut().bo.obj).resv;
> >>>> +            bindings::dma_resv_lock(resv, core::ptr::null_mut());
> >>>> +            bindings::drm_gpuvm_bo_put(&mut obj.as_mut().bo);
> >>>> +            bindings::dma_resv_unlock(resv);  
> >>> 
> >>> What if the resv_lock is held already? Please also make sure to put multiple
> >>> unsafe calls each in a separate unsafe block.  
> >> 
> >> By whom?  
> > 
> > The lock might be held already by the driver or by TTM when things are called
> > from TTM callbacks.
> > 
> > This is why GPUVM never takes locks by itself, but asserts that the correct lock
> > is held.
> > 
> > I think we really want to get proof from the driver, by having it provide
> > lock guard references.
> >   
> 
> There doesn't seem to be a solution that ticks all the boxes here.
> 
> As you said, at this point the current status of the resv is unknown. If we
> simply assume that it is not taken, we run into the problem you pointed out:
> recursive locking, where TTM or some other layer already holds the lock.
> 
> Alternatively, if we assume that the resv must be locked in dec_ref(), then we
> may build a lock::Guard from it and assert that it is held, but in any case
> it's very confusing to expect the reservation to be locked on a dec_ref() call.
> 
> The fact that dec_ref() is called automatically on drop will massively
> complicate the call sites:

I'm digressing, but there's an aspect I find very annoying in the C
version of the API: the fact that we have to take a BO ref, then lock,
then release the vm_bo [1], because otherwise the vm_bo might be the
last owner of a BO ref, leading to a UAF on the lock itself. This, to
me, denotes a lifetime issue that would be good to address in the Rust
version of the API.
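
To make the dance concrete, here's roughly what [1] does, transposed
into Rust-flavored pseudo-code (all names hypothetical, not the actual
bindings):

    // Dropping a vm_bo ref safely: the vm_bo may own the last BO ref,
    // and the lock we must hold lives inside the BO, so the BO has to
    // be kept alive until after the unlock.
    fn vm_bo_put(vm_bo: ARef<VmBo>) {
        let bo = vm_bo.obj().clone(); // take an extra BO ref first
        let guard = bo.resv_lock();   // the lock lives in the BO
        drop(vm_bo);                  // may free the vm_bo, but not the BO
        drop(guard);                  // unlock while the BO is still alive
        // `bo` dropped here: only now may the BO (and its lock) go away
    }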

It's not exactly the same problem, but I think it comes from the same
root issue: a lax ownership definition. By that I mean it's not clear
who owns whom. gem_object::gpuva::list has weak refs on the vm_bos it
contains, which kinda makes sense, because vm_bos themselves have a
strong ref on the gem_object, and if we made this weak ref a strong
one we'd never free either of the two objects. The lock is also part
of the BO (either the BO resv lock or a custom lock), and since it's
the very same lock we use to insert/remove vm_bos, that's problematic.
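
Schematically, the current ownership graph looks like this (an
illustrative sketch, not the real types):

    struct GemObject {
        // Weak refs: a strong ref here would form a cycle with
        // VmBo::obj, and neither object would ever be freed.
        gpuva_list: List<Weak<VmBo>>,
        // The lock protecting gpuva_list lives *inside* the BO, whose
        // lifetime depends on the vm_bos in that very list.
        resv: DmaResv,
    }

    struct VmBo {
        obj: Arc<GemObject>, // strong ref: each vm_bo keeps its BO alive
    }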

If we made the gpuvm_bo_list a separate object that's initially
created by the BO, and then let the GPUVM layer manipulate only this
list, it could work. Of course, that means the resv lock/driver custom
lock would have to come from this object too, and I'm not sure that's
an option when dma_buf imports are involved.
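
As a rough sketch of what I mean (hypothetical names, and assuming the
open question about imported dma_bufs can be solved):

    // The list and its lock are split out of the BO into an object
    // with its own refcount, so it can be locked safely even while the
    // last BO ref is being dropped.
    struct GpuvaList {
        lock: Mutex<()>,        // resv or driver custom lock lives here
        list: List<Weak<VmBo>>,
    }

    struct GemObject {
        // Created by the BO, but refcounted independently: GPUVM only
        // ever manipulates the GpuvaList, never the BO directly.
        gpuva: Arc<GpuvaList>,
    }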

> 
> We will have to ensure that the resv is locked at every point where we
> interface with a GpuVmBo, because each of these points could be dropping the
> last active ref. If we don't, then we've introduced a race where the list is
> modified with no lock taken, which will be a pretty easy mistake to make. This
> seems to also be the case in C, which we should try to improve upon.

Yep, with auto-unref thrown into the mix you have to be very careful
about which paths might release the last vm_bo ref, and make sure an
extra ref is taken on the BO and the resv/custom lock is held when that
happens.
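
That's basically Danilo's point about lock guard references: the API
could demand proof of both at the type level. Something like this
sketch (hypothetical types, reusing the names from the sketches above):

    // The signature forces the caller to prove that an extra BO ref
    // exists and that the right lock is held for the whole duration
    // of the put.
    fn vm_bo_put<'a>(
        vm_bo: ARef<VmBo>,
        _bo: &'a ARef<GemObject>, // proof of an extra BO ref
        _held: &'a Guard<'a>,     // proof the resv/custom lock is held
    ) {
        drop(vm_bo); // may drop the last vm_bo ref; BO and lock outlive it
    }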

> 
> My suggestion is to introduce a separate GPU-VA lock here:
> 
> /// A base GEM object.
> #[repr(C)]
> #[pin_data]
> pub struct Object<T: DriverObject> {
>     obj: bindings::drm_gem_object,
>     // The DRM core ensures the Device exists as long as its objects exist, so we don't need to
>     // manage the reference count here.
>     dev: *const bindings::drm_device,
>     #[pin]
>     inner: T,
>     #[pin]
>     _p: PhantomPinned,
>     // Add a GPU-VA lock here <--------
> }
> 
> And only support custom locks in Rust, giving up the optimization where the
> resv is reused and whatever perf improvements that might bring to the table.

Yes, if it were only about perf optimizations, I'd like to see numbers
proving that taking an extra lock that's always going to be
uncontended, in a path where you already took the BO resv lock,
actually makes a difference; honestly, I doubt it. But my fear is that
it's not so much about avoiding an extra lock, and more about making
sure this list insertion/deletion doesn't race with other paths that
assume taking the resv lock is enough to guarantee exclusive access to
this vm_bo list (I mean places outside gpuvm, in the drivers directly).
I guess that is fixable.

> 
> Notice that this would sidestep this entire discussion: nobody else would be
> aware of this new lock, so we could safely take it in dec_ref(). We would also
> be transparently managing the locking on behalf of drivers in all the other
> calls where the VA list is accessed, which, as I said above, is another plus.

If the lock is part of the gem_object, it doesn't solve the problem I
described above: you might be taking a lock that disappears from under
you if you don't take a BO ref first. In the end, that's still a risky
business.

> 
> I understand that most C drivers do not need an extra lock, but it's getting
> hard to emulate this behavior in Rust.
> 
> Also, the fact that they don't need an extra lock doesn't change the fact that
> it would simply be safer to have one anyway. In other words, it is currently
> entirely possible to use GPUVM without locking anything, and IMHO we shouldn't
> carry that over if we can help it.

Overall, I do agree with Daniel here. We should first think about how
to make the C API more user-friendly by clearly defining
ownership/lifetimes before we try to add Rust bindings on top. Deciding
where the lock comes from is part of that discussion, but IMHO it's not
the only thing we need to sort out.

I hope my chiming in didn't make the situation worse :-/, and I'd be
happy if someone convinced me that what I'm complaining about here is
actually not a problem in Rust :-).

Regards,

Boris

[1] https://elixir.bootlin.com/linux/v6.15.2/source/drivers/gpu/drm/panthor/panthor_mmu.c#L1090
