Message-Id: <477186AF-A8D4-4D4E-9F58-86958C654333@collabora.com>
Date: Thu, 19 Jun 2025 10:49:34 -0300
From: Daniel Almeida <daniel.almeida@...labora.com>
To: Boris Brezillon <boris.brezillon@...labora.com>
Cc: Danilo Krummrich <dakr@...nel.org>,
 Miguel Ojeda <ojeda@...nel.org>,
 Alex Gaynor <alex.gaynor@...il.com>,
 Boqun Feng <boqun.feng@...il.com>,
 Gary Guo <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>,
 Benno Lossin <benno.lossin@...ton.me>,
 Andreas Hindborg <a.hindborg@...nel.org>,
 Alice Ryhl <aliceryhl@...gle.com>,
 Trevor Gross <tmgross@...ch.edu>,
 Sumit Semwal <sumit.semwal@...aro.org>,
 Christian König <christian.koenig@....com>,
 Alyssa Rosenzweig <alyssa@...enzweig.io>,
 Lyude Paul <lyude@...hat.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>,
 Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>,
 linux-kernel@...r.kernel.org,
 rust-for-linux@...r.kernel.org,
 dri-devel@...ts.freedesktop.org,
 Asahi Lina <lina@...hilina.net>
Subject: Re: [PATCH v2 2/2] rust: drm: Add GPUVM abstraction

Hi Boris,

> On 19 Jun 2025, at 08:57, Boris Brezillon <boris.brezillon@...labora.com> wrote:
> 
> Hi,
> 
> On Fri, 13 Jun 2025 13:42:59 -0300
> Daniel Almeida <daniel.almeida@...labora.com> wrote:
> 
>> Danilo,
>> 
>> 
>>> <snip>
>>> 
>>>>>> +// SAFETY: DRM GpuVmBo objects are always reference counted and the get/put functions
>>>>>> +// satisfy the requirements.
>>>>>> +unsafe impl<T: DriverGpuVm> AlwaysRefCounted for GpuVmBo<T> {
>>>>>> +    fn inc_ref(&self) {
>>>>>> +        // SAFETY: The drm_gpuvm_get function satisfies the requirements for inc_ref().
>>>>>> +        unsafe { bindings::drm_gpuvm_bo_get(&self.bo as *const _ as *mut _) };
>>>>>> +    }
>>>>>> +
>>>>>> +    unsafe fn dec_ref(mut obj: NonNull<Self>) {
>>>>>> +        // SAFETY: drm_gpuvm_bo_put() requires holding the gpuva lock, which is
>>>>>> +        // the dma_resv lock by default.
>>>>>> +        // The drm_gpuvm_put function satisfies the requirements for dec_ref().
>>>>>> +        // (We do not support custom locks yet.)
>>>>>> +        unsafe {
>>>>>> +            let resv = (*obj.as_mut().bo.obj).resv;
>>>>>> +            bindings::dma_resv_lock(resv, core::ptr::null_mut());
>>>>>> +            bindings::drm_gpuvm_bo_put(&mut obj.as_mut().bo);
>>>>>> +            bindings::dma_resv_unlock(resv);  
>>>>> 
>>>>> What if the resv_lock is held already? Please also make sure to put multiple
>>>>> unsafe calls each in a separate unsafe block.  
>>>> 
>>>> By whom?  
>>> 
>>> The lock might be held already by the driver or by TTM when things are called
>>> from TTM callbacks.
>>> 
>>> This is why GPUVM never takes locks by itself, but asserts that the correct lock
>>> is held.
>>> 
>>> I think we really want to get proof by the driver by providing lock guard
>>> references.
>>> 
>> 
>> There doesn’t seem to be a solution that ticks all the boxes here.
>> 
>> As you said, at this point the current status of the resv is unknown. If we
>> simply assume that it is not taken, we run into the problem you pointed out,
>> i.e. recursive locking, where TTM or some other layer already holds the lock.
>> 
>> Alternatively, if we assume that the resv must be locked in dec_ref(), then we
>> may build a lock::Guard from it and assert that it is held, but in any case
>> it's very confusing to expect the reservation to be locked on a dec_ref() call.
>> 
>> The fact that dec_ref() is called automatically on drop will massively
>> complicate the call sites:
> 
> I'm digressing, but there's an aspect I found very annoying in the C
> version of the API: the fact that we have to take a BO ref, then lock,
> then release the vm_bo [1], because otherwise the vm_bo might be the
> last owner of a BO ref, leading to a UAF on the lock itself. This, to me,
> denotes a lifetime issue that I think would be good to address in the
> rust version of the API.
> 
> It's not exactly the same problem, but I think it comes from the same
> root issue: lax ownership definition. By that I mean it's not clear
> who's the owner and who's the owned. gem_object::gpuva::list has
> weak refs on the vm_bos contained in this list, which kinda makes sense
> because vm_bos themselves have a ref on the gem_object, and if we were
> to make this weak ref a strong ref we'd never free any of these two
> objects. The lock is also part of the BO (either the BO resv lock, or a
> custom lock), and since it's the very same lock we use to insert/remove
> vm_bos, that's problematic.
> 
> If we made the gpuvm_bo_list a separate object that's originally
> created by the BO, and then let the GPUVM layer manipulate only this
> list, it could work. Of course that means the resv lock/driver custom
> lock should come from this object too, and I'm not too sure that's an
> option when dma_buf imports are involved.

This would require changes to the C side, because the C helpers expect the gpuva
list to be inside drm_gem_object, and these helpers are used in the Rust
bindings.

Don't get me wrong, IMHO we can definitely pursue this if needed, or if it's an
improvement. I am merely pointing out that it's a more elaborate solution than
what I had originally proposed, so that we are all aware.
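
For the sake of discussion, here's a very rough sketch of how I read that
proposal on the Rust side. Every name below is hypothetical, and the C helpers
would have to be reworked to operate on an object like this instead of on
drm_gem_object directly:

use kernel::sync::Mutex;

// Hypothetical: a separately refcounted object, created by the BO, that
// owns both the vm_bo list and the lock guarding it, so the lock cannot
// disappear while someone is still walking the list.
#[pin_data]
pub struct GpuvaList {
    // Guards vm_bo insertion/removal. Lives and dies with this object,
    // not with the GEM object.
    #[pin]
    list: Mutex<bindings::list_head>,
}

As you said, whether dma_buf imports can be made to fit this scheme is the
open question.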

> 
>> 
>> We will have to ensure that the resv is locked at every point where we interface
>> with a GpuVmBo, because each of these points could drop the last active
>> ref. If we don't, then we've introduced a race where the list is modified but
>> no lock is taken, which will be a pretty easy mistake to make. This seems to
>> also be the case in C, which we should try to improve upon.
> 
> Yep, with auto-unref thrown into the mix you have to be very careful about
> which paths might release the last vm_bo ref, and make sure an extra
> ref is taken on the BO, and the resv/custom lock is held when that
> happens.
> 
>> 
>> My suggestion is to introduce a separate GPU-VA lock here:
>> 
>> /// A base GEM object.
>> #[repr(C)]
>> #[pin_data]
>> pub struct Object<T: DriverObject> {
>>    obj: bindings::drm_gem_object,
>>    // The DRM core ensures the Device exists as long as its objects exist, so we don't need to
>>    // manage the reference count here.
>>    dev: *const bindings::drm_device,
>>    #[pin]
>>    inner: T,
>>    #[pin]
>>    _p: PhantomPinned,
>>    // Add a GPU-VA lock here <--------
>> }
>> 
>> And only support custom locks in Rust, to the detriment of the optimization
>> where the resv is reused, and of any perf improvements that reusing the
>> reservation might bring to the table.
> 
> Yes, if it was only about perf optimizations, then I'd like to see
> numbers that prove taking an extra lock that's always going to be free
> in a path where you already took the BO resv lock actually makes a
> difference, and honestly, I doubt it. But my fear is that it's not so
> much about avoiding an extra lock to be taken, and more about making
> sure this list insertion/deletion doesn't race with other paths that
> are assuming that taking the resv lock is enough to guarantee exclusive
> access to this vm_bo list (I mean places outside gpuvm, in the drivers
> directly). I guess this is fixable.
> 
>> 
>> Notice that this would sidestep this entire discussion: nobody else would be
>> aware of this new lock so we could safely lock it in dec_ref(). We would also
>> be transparently managing the locking on behalf of drivers in all the other
>> calls where the VA list is accessed, which is another plus as I said above.
> 
> If the lock is part of the gem_object, it's not solving the problem I
> described above, because you might be taking a lock that disappears if
> you don't take a BO ref before taking the lock. In the end, that's
> still a risky business.

I must confess I was not aware of the issue you're discussing. We can add an
instance of gem::ObjectRef in gpuvm::GpuVmBo if you think this will solve it.
This will make sure that the vmbo owns the gem object (or rather that it has a
+1 on the refcount) for its lifetime. Is that the semantics you're looking
for?
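
Something along these lines, i.e. a sketch only, where `T::Object` is a
placeholder for however we end up naming the driver's GEM object type:

// Sketch: GpuVmBo keeps a strong GEM reference for its whole lifetime,
// so the resv lock (which lives in the GEM object) cannot be freed out
// from under a live vm_bo.
#[repr(C)]
#[pin_data]
pub struct GpuVmBo<T: DriverGpuVm> {
    #[pin]
    bo: bindings::drm_gpuvm_bo,
    // +1 on the GEM refcount for as long as this vm_bo exists.
    gem: gem::ObjectRef<T::Object>,
    #[pin]
    _p: PhantomPinned,
}

That way the vm_bo -> GEM ownership direction you described is encoded in the
type itself.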

I thought about adding this lock to the Rust wrapper for the vmbo, but then
we'd run into the issue of shared BOs: each VM would have its own vmbo, and
each vmbo would thus take a different lock when attempting to modify the same
underlying gpuva list.

Just to be clear, by shared BO here I specifically mean a BO that's connected
to more than one VM, which is possible, IIUC.
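
In other words, with hypothetical names, just to illustrate:

// One GEM object mapped into two VMs yields two distinct vm_bos. With a
// per-vmbo lock, vm_bo_a and vm_bo_b would take different locks while
// inserting into / removing from the same obj.gpuva list.
let vm_bo_a = vm_a.obtain_bo(&obj)?; // vm_bo for (vm_a, obj)
let vm_bo_b = vm_b.obtain_bo(&obj)?; // vm_bo for (vm_b, obj)
// Only a lock that lives with obj itself (the resv, or the per-BO lock
// suggested above) serializes these two paths.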

— Daniel


