Message-ID: <20250908091140.44856fde@fedora>
Date: Mon, 8 Sep 2025 09:11:40 +0200
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Danilo Krummrich <dakr@...nel.org>, Matthew Brost
 <matthew.brost@...el.com>, "Thomas Hellström"
 <thomas.hellstrom@...ux.intel.com>, Maarten Lankhorst
 <maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
 Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>, Steven Price <steven.price@....com>,
 Daniel Almeida <daniel.almeida@...labora.com>, Liviu Dudau
 <liviu.dudau@....com>, dri-devel@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: [PATCH 1/2] drm/gpuvm: add deferred vm_bo cleanup

Hi Alice,

On Sun, 7 Sep 2025 11:39:41 +0000
Alice Ryhl <aliceryhl@...gle.com> wrote:

> On Sun, Sep 07, 2025 at 01:28:05PM +0200, Danilo Krummrich wrote:
> > On Sun Sep 7, 2025 at 1:15 PM CEST, Alice Ryhl wrote:  
> > > On Sat, Sep 06, 2025 at 12:47:36AM +0200, Danilo Krummrich wrote:  
> > >> On Fri Sep 5, 2025 at 8:18 PM CEST, Alice Ryhl wrote:  
> > >> > On Fri, Sep 5, 2025 at 3:25 PM Boris Brezillon
> > >> > <boris.brezillon@...labora.com> wrote:  
> > >> >> On Fri, 05 Sep 2025 12:11:28 +0000
> > >> >> Alice Ryhl <aliceryhl@...gle.com> wrote:  
> > >> >> > +static bool
> > >> >> > +drm_gpuvm_bo_is_dead(struct drm_gpuvm_bo *vm_bo)
> > >> >> > +{
> > >> >> > +     return !kref_read(&vm_bo->kref);  
> > >> >>
> > >> >> I'm not too sure I like the idea of [ab]using vm_bo::kref to defer the
> > >> >> vm_bo release. I get why it's done like that, but I'm wondering why we
> > >> >> don't defer the release of drm_gpuva objects instead (which is really
> > >> >> what's being released in va_unlink()). I can imagine drivers wanting to
> > >> >> attach resources to the gpuva that can't be released in the
> > >> >> dma-signalling path in the future, and if we're doing that at the gpuva
> > >> >> level, we also get rid of this kref dance, since the va will hold a
> > >> >> vm_bo ref until it's destroyed.
> > >> >>
> > >> >> Any particular reason you went for vm_bo destruction deferral instead
> > >> >> of gpuva?  
> > >> >
> > >> > All of the things that were unsafe to release in the signalling path
> > >> > were tied to the vm_bo, so that is why I went for vm_bo cleanup.
> > >> > Another advantage is that it lets us use the same deferred logic for
> > >> > the vm_bo_put() call that drops the refcount from vm_bo_obtain().
> > >> >
> > >> > Of course if gpuvas might have resources that need deferred cleanup,
> > >> > that might change the situation somewhat.  
> > >> 
> > >> I think we want to track PT(E) allocations, or rather reference counts of page
> > >> table structures carried by the drm_gpuva, but we don't need to release them on
> > >> drm_gpuva_unlink(), which is where we drop the reference count of the vm_bo.
> > >> 
> > >> Deferring drm_gpuva_unlink() isn't really an option I think; the GEM's list of
> > >> VM_BOs and the VM_BO's list of VAs are usually used in ttm_device_funcs::move to
> > >> map or unmap all VAs associated with a GEM object.
> > >> 
> > >> I think PT(E) reference counts etc. should be rather released when the drm_gpuva
> > >> is freed, i.e. page table allocations can be bound to the lifetime of a
> > >> drm_gpuva. Given that, I think that eventually we'll need a cleanup list for
> > >> those as well, since once they're removed from the VM tree (in the fence
> > >> signalling critical path), we lose access otherwise.  
> > >
> > > Hmm. Another more conceptual issue with deferring gpuva is that
> > > "immediate mode" is defined as having the GPUVM match the GPU's actual
> > > address space at all times, which deferred gpuva cleanup would go
> > > against.  
> > 
> > Depends on what "deferred gpuva cleanup" means.
> > 
> > What needs to happen in the run_job() is drm_gpuva_unlink() and
> > drm_gpuva_unmap(). Freeing the drm_gpuva, including releasing the associated
> > driver specific resources, can be deferred.  
> 
> Yeah, I guess we could have unlink remove the gpuva, but then allow the
> end-user to attach the gpuva to a list of gpuvas to be kfree'd later. That
> way, drm_gpuva_unlink() itself is not deferred, but any resources the
> gpuva holds can be.

This ^.
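
To make the idea a bit more concrete, here is a minimal driver-side
sketch of what such a deferred gpuva free list could look like (all
struct/function names below are hypothetical, not existing gpuvm API,
and the details would of course need to be adapted per driver):

struct my_gpuva {
	struct drm_gpuva base;
	/* vm_bo reference held by the va until it is actually freed. */
	struct drm_gpuvm_bo *vm_bo;
	/* Entry in the VM's deferred-free list. */
	struct llist_node defer_node;
	/* Driver-specific resources (PT(E) refcounts, ...) go here. */
};

struct my_vm {
	struct drm_gpuvm base;
	/* gpuvas waiting to be freed outside the signalling path. */
	struct llist_head defer_free_list;
};

/* Called from run_job() (fence signalling critical path). */
static void my_vm_unmap_va(struct my_vm *vm, struct my_gpuva *va)
{
	/* Unlink now, so GPUVM keeps matching the GPU's address space. */
	drm_gpuva_unlink(&va->base);
	/* ... drm_gpuva_unmap()/removal from the VM tree as usual ... */

	/* Only the release of the va and its resources is deferred. */
	llist_add(&va->defer_node, &vm->defer_free_list);
}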

> 
> Of course, this approach also makes deferred gpuva cleanup somewhat
> orthogonal to this patch.

Well, yes and no: if you go for deferred gpuva cleanup, you don't
really need the fancy kref_put() you have in this patch; it's just a
regular vm_bo_put() called in the deferred gpuva path on the vm_bo
attached to the gpuva being released.
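
Roughly, the deferred side would then be nothing more than (again,
purely illustrative and building on the hypothetical names above):

/* Runs outside the fence signalling path, e.g. from a workqueue. */
static void my_vm_process_deferred_frees(struct my_vm *vm)
{
	struct llist_node *freed = llist_del_all(&vm->defer_free_list);
	struct my_gpuva *va, *next;

	llist_for_each_entry_safe(va, next, freed, defer_node) {
		/* Release driver resources (PT(E) refcounts, ...). */

		/*
		 * Drop the vm_bo reference the va was holding: a plain
		 * drm_gpuvm_bo_put(), no dedicated kref machinery.
		 */
		drm_gpuvm_bo_put(va->vm_bo);
		kfree(va);
	}
}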

> 
> One annoying part is that we don't have a gpuvm ops operation for
> freeing a gpuva, and if we add one for this, it would *only* be used in
> this case, as most drivers explicitly kfree gpuvas, which could be
> confusing for end-users.

I'm also not sure ::vm_bo_free() was meant to be used like that. It was
intended for drivers that need to control the drm_gpuvm_bo allocation,
not for those that rely on the default implementation (kmalloc). Given
how things are described in the doc, it feels weird to have a
::vm_bo_free() without a ::vm_bo_alloc(). So, if we decide to go this
way (which I'm still not convinced we should, given that ultimately we
might want to defer gpuva cleanup), the ::vm_bo_free() doc should be
extended to cover this 'deferred vm_bo free' case.
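
For reference, the pattern the doc has in mind is a driver providing
both ops as a pair, along these lines (my_vm_bo is made up, and the
exact prototypes are from memory, so double-check against drm_gpuvm.h):

struct my_vm_bo {
	struct drm_gpuvm_bo base;
	/* driver-specific vm_bo state */
};

static struct drm_gpuvm_bo *my_vm_bo_alloc(void)
{
	struct my_vm_bo *vm_bo = kzalloc(sizeof(*vm_bo), GFP_KERNEL);

	return vm_bo ? &vm_bo->base : NULL;
}

static void my_vm_bo_free(struct drm_gpuvm_bo *vm_bo)
{
	kfree(container_of(vm_bo, struct my_vm_bo, base));
}

static const struct drm_gpuvm_ops my_gpuvm_ops = {
	.vm_bo_alloc = my_vm_bo_alloc,
	.vm_bo_free = my_vm_bo_free,
	/* ... */
};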

Regards,

Boris
