linux-kernel - Re: [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count drm

Message-ID: <51dea5f3-a18b-4797-b4fa-87da7db4624a@amd.com>
Date:   Mon, 6 Nov 2023 10:14:29 +0100
From:   Christian König <christian.koenig@....com>
To:     Danilo Krummrich <dakr@...hat.com>
Cc:     airlied@...il.com, daniel@...ll.ch, matthew.brost@...el.com,
        thomas.hellstrom@...ux.intel.com, sarah.walker@...tec.com,
        donald.robson@...tec.com, boris.brezillon@...labora.com,
        faith@...strand.net, dri-devel@...ts.freedesktop.org,
        nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH drm-misc-next v8 09/12] drm/gpuvm: reference count
 drm_gpuvm structures

On 03.11.23 at 16:34, Danilo Krummrich wrote:
[SNIP]
>>
>> Especially we most likely don't want the VM to live longer than the 
>> application which originally used it. If you make the GPUVM an 
>> independent object you actually open up driver abuse for the lifetime 
>> of this.
>
> Right, we don't want that. But I don't see how the reference count 
> prevents that.

It doesn't prevent that; it's just not the most defensive approach.

>
> Independent object is relative. struct drm_gpuvm is still embedded 
> into a driver-specific structure. It works the same way as struct 
> drm_gem_object.
>
>>
>> In addition to that, see below for a quite real problem with this.
>>
>>>> Background is that the most common use case I see is that this 
>>>> object is
>>>> embedded into something else and a reference count is then not 
>>>> really a good
>>>> idea.
>>> Do you have a specific use-case in mind where this would interfere?
>>
>> Yes, absolutely. For an example see amdgpu_mes_self_test(), here we 
>> initialize a temporary amdgpu VM for an in kernel unit test which 
>> runs during driver load.
>>
>> When the function returns I need to guarantee that the VM is 
>> destroyed or otherwise I will mess up normal operation.
>
> Nothing prevents that. The reference counting is well defined. If the 
> driver did not
> take additional references (which is clearly up to the driver taking 
> care of) and all
> VM_BOs and mappings are cleaned up, the reference count is guaranteed 
> to be 1 at this
> point.
>
> Also note that if the driver had not cleaned up all VM_BOs and 
> mappings before shutting down the VM, it would have been a bug 
> anyway, and the driver would potentially leak memory and run into 
> UAF issues.

That's exactly what I'm talking about, and why I think this is an 
extremely bad idea.

It's a perfectly normal operation to shut down the VM while there are 
still mappings. This is just what happens when you kill an application.

Because of this, a mapping should *never* hold a reference to the VM; 
instead, the VM destroys all its mappings when it is destroyed itself.
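
The ownership model described above can be sketched in plain userspace C. 
This is only an illustration under stated assumptions, not the drm_gpuvm 
API: struct vm, struct mapping, vm_create(), vm_map() and vm_destroy() 
are hypothetical names, and there is no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in drm_gpuvm. */
struct vm;

struct mapping {
	struct vm *vm;            /* plain back-pointer, holds no reference */
	struct mapping *next;
	unsigned long addr;
	unsigned long range;
};

struct vm {
	struct mapping *mappings; /* list of mappings owned by the VM */
	unsigned int nr_mappings;
};

static struct vm *vm_create(void)
{
	return calloc(1, sizeof(struct vm));
}

/* Add a mapping; the VM owns it from here on. */
static struct mapping *vm_map(struct vm *vm, unsigned long addr,
			      unsigned long range)
{
	struct mapping *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;

	m->vm = vm;
	m->addr = addr;
	m->range = range;
	m->next = vm->mappings;
	vm->mappings = m;
	vm->nr_mappings++;

	return m;
}

/*
 * Destroy the VM together with every mapping it still holds, as happens
 * when an application is killed. Returns the number of mappings freed.
 */
static unsigned int vm_destroy(struct vm *vm)
{
	unsigned int freed = 0;

	while (vm->mappings) {
		struct mapping *m = vm->mappings;

		vm->mappings = m->next;
		free(m);
		freed++;
	}
	free(vm);

	return freed;
}
```

In this sketch the caller of vm_destroy() knows the VM is gone when the 
function returns, no matter how many mappings were left behind.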

> Hence, If the VM is still alive at a point where you don't expect it 
> to be, then it's
> simply a driver bug.

Driver bugs are exactly what I'm trying to prevent here. When individual 
mappings keep the VM structure alive, drivers are responsible for 
cleaning them up. If the VM cleans up after itself, we don't need to 
worry about it in the driver.

When the mappings are destroyed together with the VM, drivers can't mess 
up this common operation. That's why this is more defensive.

It is a possible requirement that external code needs to keep 
references to the VM, but the VM must *never* hold references to itself 
through its mappings. I would consider that a major bug in the component.
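
A minimal userspace sketch of that split, assuming a simple non-atomic 
counter (kernel code would use struct kref instead). The names struct 
rvm, rvm_create(), rvm_get(), rvm_map() and rvm_put() are hypothetical 
and not part of drm_gpuvm. Only external users take references; mappings 
keep a plain back-pointer and are torn down by the final put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in drm_gpuvm. */
struct rvm;

struct rmapping {
	struct rvm *vm;            /* back-pointer only, no reference taken */
	struct rmapping *next;
};

struct rvm {
	unsigned int refcount;     /* counts external holders only */
	struct rmapping *mappings;
};

/* Create a VM; the creator holds the initial reference. */
static struct rvm *rvm_create(void)
{
	struct rvm *vm = calloc(1, sizeof(*vm));

	if (vm)
		vm->refcount = 1;

	return vm;
}

/* External code takes an additional reference. */
static struct rvm *rvm_get(struct rvm *vm)
{
	assert(vm->refcount > 0);
	vm->refcount++;

	return vm;
}

/* Adding a mapping deliberately does NOT call rvm_get(). */
static bool rvm_map(struct rvm *vm)
{
	struct rmapping *m = calloc(1, sizeof(*m));

	if (!m)
		return false;

	m->vm = vm;
	m->next = vm->mappings;
	vm->mappings = m;

	return true;
}

/*
 * Drop a reference. On the last put the VM frees its remaining mappings
 * and then itself. Returns true if the VM was destroyed.
 */
static bool rvm_put(struct rvm *vm)
{
	assert(vm->refcount > 0);
	if (--vm->refcount)
		return false;

	while (vm->mappings) {
		struct rmapping *m = vm->mappings;

		vm->mappings = m->next;
		free(m);
	}
	free(vm);

	return true;
}
```

Because leftover mappings do not contribute to the count in this sketch, 
dropping the last external reference always destroys the VM.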

Regards,
Christian.

>
>>
>> Reference counting is nice when you don't know who else is referring 
>> to your VM, but the cost is that you also don't know when the object 
>> is guaranteed to be destroyed.
>>
>> I can trivially work around this by saying that the generic GPUVM 
>> object has a different lifetime than the amdgpu specific object, but 
>> that opens up doors for use after free again.
>
> If your driver never touches the VM's reference count and exits the VM 
> with a clean state
> (no mappings and no VM_BOs left), effectively, this is the same as 
> having no reference
> count.
>
> In the very worst case you could argue that we trade a potential UAF 
> *and* memory leak
> (no reference count) with *only* a memory leak (with reference count), 
> which to me seems
> reasonable.
>
>>
>> Regards,
>> Christian.
>>
>>>> Thanks,
>>>> Christian.
>>> [1]https://lore.kernel.org/dri-devel/6fa058a4-20d3-44b9-af58-755cfb375d75@redhat.com/ 
>>>
>>>
>>>>> Signed-off-by: Danilo Krummrich<dakr@...hat.com>
>>>>> ---
>>>>>    drivers/gpu/drm/drm_gpuvm.c            | 44 +++++++++++++++++++-------
>>>>>    drivers/gpu/drm/nouveau/nouveau_uvmm.c | 20 +++++++++---
>>>>>    include/drm/drm_gpuvm.h                | 31 +++++++++++++++++-
>>>>>    3 files changed, 78 insertions(+), 17 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
>>>>> index 53e2c406fb04..6a88eafc5229 100644
>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>> @@ -746,6 +746,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
>>>>>        gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>        INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>> +    kref_init(&gpuvm->kref);
>>>>> +
>>>>>        gpuvm->name = name ? name : "unknown";
>>>>>        gpuvm->flags = flags;
>>>>>        gpuvm->ops = ops;
>>>>> @@ -770,15 +772,8 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
>>>>>    }
>>>>>    EXPORT_SYMBOL_GPL(drm_gpuvm_init);
>>>>> -/**
>>>>> - * drm_gpuvm_destroy() - cleanup a &drm_gpuvm
>>>>> - * @gpuvm: pointer to the &drm_gpuvm to clean up
>>>>> - *
>>>>> - * Note that it is a bug to call this function on a manager that still
>>>>> - * holds GPU VA mappings.
>>>>> - */
>>>>> -void
>>>>> -drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>>>>> +static void
>>>>> +drm_gpuvm_fini(struct drm_gpuvm *gpuvm)
>>>>>    {
>>>>>        gpuvm->name = NULL;
>>>>> @@ -790,7 +785,33 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>>>>>        drm_gem_object_put(gpuvm->r_obj);
>>>>>    }
>>>>> -EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>> +
>>>>> +static void
>>>>> +drm_gpuvm_free(struct kref *kref)
>>>>> +{
>>>>> +    struct drm_gpuvm *gpuvm = container_of(kref, struct drm_gpuvm, kref);
>>>>> +
>>>>> +    if (drm_WARN_ON(gpuvm->drm, !gpuvm->ops->vm_free))
>>>>> +        return;
>>>>> +
>>>>> +    drm_gpuvm_fini(gpuvm);
>>>>> +
>>>>> +    gpuvm->ops->vm_free(gpuvm);
>>>>> +}
>>>>> +
>>>>> +/**
>>>>> + * drm_gpuvm_put() - drop a struct drm_gpuvm reference
>>>>> + * @gpuvm: the &drm_gpuvm to release the reference of
>>>>> + *
>>>>> + * This releases a reference to @gpuvm.
>>>>> + */
>>>>> +void
>>>>> +drm_gpuvm_put(struct drm_gpuvm *gpuvm)
>>>>> +{
>>>>> +    if (gpuvm)
>>>>> +        kref_put(&gpuvm->kref, drm_gpuvm_free);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_put);
>>>>>    static int
>>>>>    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>> @@ -843,7 +864,7 @@ drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>        if (unlikely(!drm_gpuvm_range_valid(gpuvm, addr, range)))
>>>>>            return -EINVAL;
>>>>> -    return __drm_gpuva_insert(gpuvm, va);
>>>>> +    return __drm_gpuva_insert(drm_gpuvm_get(gpuvm), va);
>>>>>    }
>>>>>    EXPORT_SYMBOL_GPL(drm_gpuva_insert);
>>>>> @@ -876,6 +897,7 @@ drm_gpuva_remove(struct drm_gpuva *va)
>>>>>        }
>>>>>        __drm_gpuva_remove(va);
>>>>> +    drm_gpuvm_put(va->vm);
>>>>>    }
>>>>>    EXPORT_SYMBOL_GPL(drm_gpuva_remove);
>>>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>>>> index 54be12c1272f..cb2f06565c46 100644
>>>>> --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>>>> +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>>>> @@ -1780,6 +1780,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
>>>>>        }
>>>>>    }
>>>>> +static void
>>>>> +nouveau_uvmm_free(struct drm_gpuvm *gpuvm)
>>>>> +{
>>>>> +    struct nouveau_uvmm *uvmm = uvmm_from_gpuvm(gpuvm);
>>>>> +
>>>>> +    kfree(uvmm);
>>>>> +}
>>>>> +
>>>>> +static const struct drm_gpuvm_ops gpuvm_ops = {
>>>>> +    .vm_free = nouveau_uvmm_free,
>>>>> +};
>>>>> +
>>>>>    int
>>>>>    nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
>>>>>                   void *data,
>>>>> @@ -1830,7 +1842,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
>>>>>                   NOUVEAU_VA_SPACE_END,
>>>>>                   init->kernel_managed_addr,
>>>>>                   init->kernel_managed_size,
>>>>> -               NULL);
>>>>> +               &gpuvm_ops);
>>>>>        /* GPUVM takes care from here on. */
>>>>>        drm_gem_object_put(r_obj);
>>>>> @@ -1849,8 +1861,7 @@ nouveau_uvmm_ioctl_vm_init(struct drm_device *dev,
>>>>>        return 0;
>>>>>    out_gpuvm_fini:
>>>>> -    drm_gpuvm_destroy(&uvmm->base);
>>>>> -    kfree(uvmm);
>>>>> +    drm_gpuvm_put(&uvmm->base);
>>>>>    out_unlock:
>>>>>        mutex_unlock(&cli->mutex);
>>>>>        return ret;
>>>>> @@ -1902,7 +1913,6 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
>>>>>        mutex_lock(&cli->mutex);
>>>>>        nouveau_vmm_fini(&uvmm->vmm);
>>>>> -    drm_gpuvm_destroy(&uvmm->base);
>>>>> -    kfree(uvmm);
>>>>> +    drm_gpuvm_put(&uvmm->base);
>>>>>        mutex_unlock(&cli->mutex);
>>>>>    }
>>>>> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
>>>>> index 0c2e24155a93..4e6e1fd3485a 100644
>>>>> --- a/include/drm/drm_gpuvm.h
>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>> @@ -247,6 +247,11 @@ struct drm_gpuvm {
>>>>>            struct list_head list;
>>>>>        } rb;
>>>>> +    /**
>>>>> +     * @kref: reference count of this object
>>>>> +     */
>>>>> +    struct kref kref;
>>>>> +
>>>>>        /**
>>>>>         * @kernel_alloc_node:
>>>>>         *
>>>>> @@ -273,7 +278,23 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
>>>>>                u64 start_offset, u64 range,
>>>>>                u64 reserve_offset, u64 reserve_range,
>>>>>                const struct drm_gpuvm_ops *ops);
>>>>> -void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>> +
>>>>> +/**
>>>>> + * drm_gpuvm_get() - acquire a struct drm_gpuvm reference
>>>>> + * @gpuvm: the &drm_gpuvm to acquire the reference of
>>>>> + *
>>>>> + * This function acquires an additional reference to @gpuvm. It is illegal to
>>>>> + * call this without already holding a reference. No locks required.
>>>>> + */
>>>>> +static inline struct drm_gpuvm *
>>>>> +drm_gpuvm_get(struct drm_gpuvm *gpuvm)
>>>>> +{
>>>>> +    kref_get(&gpuvm->kref);
>>>>> +
>>>>> +    return gpuvm;
>>>>> +}
>>>>> +
>>>>> +void drm_gpuvm_put(struct drm_gpuvm *gpuvm);
>>>>>    bool drm_gpuvm_range_valid(struct drm_gpuvm *gpuvm, u64 addr, u64 range);
>>>>>    bool drm_gpuvm_interval_empty(struct drm_gpuvm *gpuvm, u64 addr, u64 range);
>>>>> @@ -673,6 +694,14 @@ static inline void drm_gpuva_init_from_op(struct drm_gpuva *va,
>>>>>     * operations to drivers.
>>>>>     */
>>>>>    struct drm_gpuvm_ops {
>>>>> +    /**
>>>>> +     * @vm_free: called when the last reference of a struct drm_gpuvm is
>>>>> +     * dropped
>>>>> +     *
>>>>> +     * This callback is mandatory.
>>>>> +     */
>>>>> +    void (*vm_free)(struct drm_gpuvm *gpuvm);
>>>>> +
>>>>>        /**
>>>>>         * @op_alloc: called when the &drm_gpuvm allocates
>>>>>         * a struct drm_gpuva_op
>>
>