Message-Id: <DCMJ6K06T63T.2UBTM1RL4YJ0A@kernel.org>
Date: Sun, 07 Sep 2025 13:28:05 +0200
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Alice Ryhl" <aliceryhl@...gle.com>
Cc: "Boris Brezillon" <boris.brezillon@...labora.com>, "Matthew Brost"
 <matthew.brost@...el.com>, Thomas Hellström
 <thomas.hellstrom@...ux.intel.com>, "Maarten Lankhorst"
 <maarten.lankhorst@...ux.intel.com>, "Maxime Ripard" <mripard@...nel.org>,
 "Thomas Zimmermann" <tzimmermann@...e.de>, "David Airlie"
 <airlied@...il.com>, "Simona Vetter" <simona@...ll.ch>, "Steven Price"
 <steven.price@....com>, "Daniel Almeida" <daniel.almeida@...labora.com>,
 "Liviu Dudau" <liviu.dudau@....com>, <dri-devel@...ts.freedesktop.org>,
 <linux-kernel@...r.kernel.org>, <rust-for-linux@...r.kernel.org>
Subject: Re: [PATCH 1/2] drm/gpuvm: add deferred vm_bo cleanup

On Sun Sep 7, 2025 at 1:15 PM CEST, Alice Ryhl wrote:
> On Sat, Sep 06, 2025 at 12:47:36AM +0200, Danilo Krummrich wrote:
>> On Fri Sep 5, 2025 at 8:18 PM CEST, Alice Ryhl wrote:
>> > On Fri, Sep 5, 2025 at 3:25 PM Boris Brezillon
>> > <boris.brezillon@...labora.com> wrote:
>> >> On Fri, 05 Sep 2025 12:11:28 +0000
>> >> Alice Ryhl <aliceryhl@...gle.com> wrote:
>> >> > +static bool
>> >> > +drm_gpuvm_bo_is_dead(struct drm_gpuvm_bo *vm_bo)
>> >> > +{
>> >> > +     return !kref_read(&vm_bo->kref);
>> >>
>> >> I'm not too sure I like the idea of [ab]using vm_bo::kref to defer the
>> >> vm_bo release. I get why it's done like that, but I'm wondering why we
>> >> don't defer the release of drm_gpuva objects instead (which is really
>> >> what's being released in va_unlink()). I can imagine drivers wanting to
>> >> attach resources to the gpuva that can't be released in the
>> >> dma-signalling path in the future, and if we're doing that at the gpuva
>> >> level, we also get rid of this kref dance, since the va will hold a
>> >> vm_bo ref until it's destroyed.
>> >>
>> >> Any particular reason you went for vm_bo destruction deferral instead
>> >> of gpuva?
>> >
>> > All of the things that were unsafe to release in the signalling path
>> > were tied to the vm_bo, so that is why I went for vm_bo cleanup.
>> > Another advantage is that it lets us use the same deferred logic for
>> > the vm_bo_put() call that drops the refcount from vm_bo_obtain().
>> >
>> > Of course if gpuvas might have resources that need deferred cleanup,
>> > that might change the situation somewhat.
>> 
>> I think we want to track PT(E) allocations, or rather reference counts of page
>> table structures carried by the drm_gpuva, but we don't need to release them on
>> drm_gpuva_unlink(), which is where we drop the reference count of the vm_bo.
>> 
>> Deferring drm_gpuva_unlink() isn't really an option I think, the GEMs list of
>> VM_BOs and the VM_BOs list of VAs is usually used in ttm_device_funcs::move to
>> map or unmap all VAs associated with a GEM object.
>> 
>> I think PT(E) reference counts etc. should be rather released when the drm_gpuva
>> is freed, i.e. page table allocations can be bound to the lifetime of a
>> drm_gpuva. Given that, I think that eventually we'll need a cleanup list for
>> those as well, since once they're removed from the VM tree (in the fence
>> signalling critical path), we lose access otherwise.
>
> Hmm. Another more conceptual issue with deferring gpuva is that
> "immediate mode" is defined as having the GPUVM match the GPU's actual
> address space at all times, which deferred gpuva cleanup would go
> against.

Depends on what "deferred gpuva cleanup" means.

What needs to happen in run_job() is drm_gpuva_unlink() and
drm_gpuva_unmap(). Freeing the drm_gpuva, including releasing the associated
driver-specific resources, can be deferred.
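
To illustrate the split, here is a rough userspace sketch. All sketch_* names
are hypothetical stand-ins, not the GPUVM API, and a plain singly linked list
stands in for whatever lockless list a real implementation would use. The
signalling path only unlinks the VA and queues it; the free happens later from
a context that is allowed to block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the real drm_gpuva / drm_gpuvm types. */
struct sketch_va {
	struct sketch_va *next;	/* link in the VM list or the deferred list */
	bool linked;		/* still linked to its (imaginary) vm_bo */
	bool mapped;		/* still part of the VM's address space */
	void *driver_res;	/* e.g. page table references */
};

struct sketch_vm {
	struct sketch_va *mapped;	/* VAs currently visible in the VM */
	struct sketch_va *deferred;	/* VAs waiting to be freed */
};

static struct sketch_va *sketch_va_create(struct sketch_vm *vm)
{
	struct sketch_va *va = calloc(1, sizeof(*va));

	if (!va)
		return NULL;
	va->driver_res = malloc(64);
	if (!va->driver_res) {
		free(va);
		return NULL;
	}
	va->linked = true;
	va->mapped = true;
	va->next = vm->mapped;
	vm->mapped = va;
	return va;
}

/*
 * What would run in the signalling path: remove the VA from the VM and
 * queue it for later. Nothing is freed and nothing is allocated here.
 */
static void sketch_va_unmap_unlink(struct sketch_vm *vm, struct sketch_va *va)
{
	struct sketch_va **p;

	for (p = &vm->mapped; *p; p = &(*p)->next) {
		if (*p == va) {
			*p = va->next;
			break;
		}
	}
	va->mapped = false;
	va->linked = false;
	va->next = vm->deferred;
	vm->deferred = va;
}

/*
 * What would run later, outside the signalling path: release the
 * driver resources and free the VA. Returns the number of VAs freed.
 */
static size_t sketch_vm_deferred_cleanup(struct sketch_vm *vm)
{
	size_t freed = 0;

	while (vm->deferred) {
		struct sketch_va *va = vm->deferred;

		vm->deferred = va->next;
		assert(!va->mapped && !va->linked);
		free(va->driver_res);
		free(va);
		freed++;
	}
	return freed;
}
```

In this sketch the VM's view stays in sync with the address space at all
times, since the VA leaves the mapped list immediately; only the memory and
the attached resources outlive the unmap.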

> Deferring vm_bo cleanup doesn't have this issue because even though the
> vm_bo isn't kfreed immediately, all GPUVM APIs still treat it as if it
> isn't there anymore.