lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAH5fLghp+4dx6-JAfbSWDLz7WOdwtnLeuxdGhmVPax+HKbTv3w@mail.gmail.com>
Date: Wed, 1 Oct 2025 13:45:36 +0200
From: Alice Ryhl <aliceryhl@...gle.com>
To: Boris Brezillon <boris.brezillon@...labora.com>
Cc: Danilo Krummrich <dakr@...nel.org>, Matthew Brost <matthew.brost@...el.com>, 
	Thomas Hellström <thomas.hellstrom@...ux.intel.com>, 
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>, 
	Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, 
	Steven Price <steven.price@....com>, Daniel Almeida <daniel.almeida@...labora.com>, 
	Liviu Dudau <liviu.dudau@....com>, dri-devel@...ts.freedesktop.org, 
	linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: [PATCH v3 1/2] drm/gpuvm: add deferred vm_bo cleanup

On Wed, Oct 1, 2025 at 1:27 PM Boris Brezillon
<boris.brezillon@...labora.com> wrote:
>
> On Wed, 01 Oct 2025 10:41:36 +0000
> Alice Ryhl <aliceryhl@...gle.com> wrote:
>
> > When using GPUVM in immediate mode, it is necessary to call
> > drm_gpuvm_unlink() from the fence signalling critical path. However,
> > unlink may call drm_gpuvm_bo_put(), which causes some challenges:
> >
> > 1. drm_gpuvm_bo_put() often requires you to take resv locks, which you
> >    can't do from the fence signalling critical path.
> > 2. drm_gpuvm_bo_put() calls drm_gem_object_put(), which is often going
> >    to be unsafe to call from the fence signalling critical path.
> >
> > To solve these issues, add a deferred version of drm_gpuvm_unlink() that
> > adds the vm_bo to a deferred cleanup list, and then clean it up later.
> >
> > The new methods take the GEMs GPUVA lock internally rather than letting
> > the caller do it because it also needs to perform an operation after
> > releasing the mutex again. This is to prevent freeing the GEM while
> > holding the mutex (more info as comments in the patch). This means that
> > the new methods can only be used with DRM_GPUVM_IMMEDIATE_MODE.
> >
> > Reviewed-by: Boris Brezillon <boris.brezillon@...labora.com>
> > Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>

> > +/*
> > + * Must be called with GEM mutex held. After releasing GEM mutex,
> > + * drm_gpuvm_bo_defer_free_unlocked() must be called.
> > + */
> > +static void
> > +drm_gpuvm_bo_defer_free_locked(struct kref *kref)
> > +{
> > +     struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
> > +                                               kref);
> > +     struct drm_gpuvm *gpuvm = vm_bo->vm;
> > +
> > +     if (!drm_gpuvm_resv_protected(gpuvm)) {
> > +             drm_gpuvm_bo_list_del(vm_bo, extobj, true);
> > +             drm_gpuvm_bo_list_del(vm_bo, evict, true);
> > +     }
> > +
> > +     list_del(&vm_bo->list.entry.gem);
> > +}
> > +
> > +/*
> > + * GEM mutex must not be held. Called after drm_gpuvm_bo_defer_free_locked().
> > + */
> > +static void
> > +drm_gpuvm_bo_defer_free_unlocked(struct drm_gpuvm_bo *vm_bo)
> > +{
> > +     struct drm_gpuvm *gpuvm = vm_bo->vm;
> > +
> > +     llist_add(&vm_bo->list.entry.bo_defer, &gpuvm->bo_defer);
>
> Could we simply move this line to drm_gpuvm_bo_defer_free_locked()?
> I might be missing something, but I don't really see a reason to
> have it exposed as a separate operation.

No, if drm_gpuvm_bo_deferred_cleanup() is called in parallel (e.g.
from a workqueue as we discussed), then this can lead to kfreeing the
GEM while we hold the mutex. We must not add the vm_bo until it's safe
to kfree the GEM. See the comment on
drm_gpuvm_bo_defer_free_unlocked() below.

> > +}
> > +
> > +static void
> > +drm_gpuvm_bo_defer_free(struct kref *kref)
> > +{
> > +     struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
> > +                                               kref);
> > +
> > +     mutex_lock(&vm_bo->obj->gpuva.lock);
> > +     drm_gpuvm_bo_defer_free_locked(kref);
> > +     mutex_unlock(&vm_bo->obj->gpuva.lock);
> > +
> > +     /*
> > +      * It's important that the GEM stays alive for the duration in which we
> > +      * hold the mutex, but the instant we add the vm_bo to bo_defer,
> > +      * another thread might call drm_gpuvm_bo_deferred_cleanup() and put
> > +      * the GEM. Therefore, to avoid kfreeing a mutex we are holding, we add
> > +      * the vm_bo to bo_defer *after* releasing the GEM's mutex.
> > +      */
> > +     drm_gpuvm_bo_defer_free_unlocked(vm_bo);
> > +}

Alice

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ