Message-ID: <ZQBF80DkaB8Y/StN@pollux>
Date: Tue, 12 Sep 2023 13:05:23 +0200
From: Danilo Krummrich <dakr@...hat.com>
To: Thomas Hellström
<thomas.hellstrom@...ux.intel.com>
Cc: airlied@...il.com, daniel@...ll.ch, matthew.brost@...el.com,
sarah.walker@...tec.com, donald.robson@...tec.com,
boris.brezillon@...labora.com, christian.koenig@....com,
faith.ekstrand@...labora.com, dri-devel@...ts.freedesktop.org,
nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a
VM / BO combination
On Tue, Sep 12, 2023 at 12:33:14PM +0200, Thomas Hellström wrote:
>
> On 9/12/23 12:06, Danilo Krummrich wrote:
> > On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
> > > Hi, Danilo
> > >
> > > On 9/11/23 19:49, Danilo Krummrich wrote:
> > > > Hi Thomas,
> > > >
> > > > On 9/11/23 19:19, Thomas Hellström wrote:
> > > > > Hi, Danilo
> > > > >
> > > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > > > representing this abstraction), while each drm_gpuvm_bo contains a
> > > > > > list of mappings of this GEM object.
> > > > > >
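> > > > > > A minimal sketch of the resulting links (field names here are
> > > > > > illustrative, not necessarily the exact ones from this patch):
> > > > > >
> > > > > > 	struct drm_gpuvm_bo {
> > > > > > 		struct drm_gpuvm *vm;         /* the VM of this combination */
> > > > > > 		struct drm_gem_object *obj;   /* the backing GEM object */
> > > > > > 		struct kref kref;             /* lifetime of the combination */
> > > > > >
> > > > > > 		struct list_head gpuva_list;  /* drm_gpuva mappings of obj in vm */
> > > > > > 		struct list_head gem_entry;   /* entry in obj's list of vm_bos */
> > > > > > 	};
> > > > > >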
> > > > > > This has multiple advantages:
> > > > > >
> > > > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > > > > of the drm_gpuvm. This is useful for tracking external and evicted
> > > > > > objects per VM, which is introduced in subsequent patches.
> > > > > >
> > > > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > > > > drm_gpuvm becomes much cheaper.
> > > > > >
> > > > > > 3) Drivers can derive and extend the structure to easily represent
> > > > > > driver specific states of a BO for a certain GPUVM.
> > > > > >
> > > > > > The idea of this abstraction was taken from amdgpu, hence the credit
> > > > > > for this idea goes to the developers of amdgpu.
> > > > > >
> > > > > > Cc: Christian König <christian.koenig@....com>
> > > > > > Signed-off-by: Danilo Krummrich <dakr@...hat.com>
> > > > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > > > their own bo definition? I figure that would mean using the gem bo's
> > > > > refcounting and providing a helper to call from the driver's bo
> > > > > release. Looks like that could potentially save a lot of code? Or is
> > > > > there something that won't work with that approach?
> > > > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > > > callbacks for drivers to register for that purpose.
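> > > >
> > > > A driver could hook these up roughly as follows (just a sketch; the
> > > > my_vm_bo type is made up and the ops signatures are my assumption):
> > > >
> > > > 	#include <linux/slab.h>
> > > > 	#include <drm/drm_gpuvm.h>
> > > >
> > > > 	struct my_vm_bo {
> > > > 		struct drm_gpuvm_bo base;  /* embedded VM / BO combination */
> > > > 		/* driver specific per-VM, per-BO state goes here */
> > > > 	};
> > > >
> > > > 	static struct drm_gpuvm_bo *my_vm_bo_alloc(void)
> > > > 	{
> > > > 		struct my_vm_bo *vm_bo = kzalloc(sizeof(*vm_bo), GFP_KERNEL);
> > > >
> > > > 		return vm_bo ? &vm_bo->base : NULL;
> > > > 	}
> > > >
> > > > 	static void my_vm_bo_free(struct drm_gpuvm_bo *vm_bo)
> > > > 	{
> > > > 		kfree(container_of(vm_bo, struct my_vm_bo, base));
> > > > 	}
> > > >
> > > > 	static const struct drm_gpuvm_ops my_gpuvm_ops = {
> > > > 		.vm_bo_alloc = my_vm_bo_alloc,
> > > > 		.vm_bo_free  = my_vm_bo_free,
> > > > 	};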
> > > >
> > > > - Danilo
> > > Now, after looking a bit deeper, I think the question could actually be
> > > rephrased as: why don't we just use the struct drm_gem_object::gpuva
> > > struct as the drm_gpuvm_bo, in the spirit of keeping things simple?
> > > Drivers would then just embed it in their BO subclass, and we'd avoid
> > > unnecessary fields in struct drm_gem_object for drivers that don't do
> > > VM_BIND yet.
> > struct drm_gem_object::gpuva is just a container holding a list to which
> > (currently) drm_gpuva structs are attached, and to which, with this patch,
> > drm_gpuvm_bo structs (the BO + VM combination) are attached instead. Doing
> > the above basically means "leave everything as it is, but move the
> > list_head of drm_gpuvas per GEM to the driver specific BO structure".
> > However, having a common connection between GEM objects and drm_gpuva
> > structs was one of the goals of the initial GPUVA manager patch series.
> >
> > > Sure, this won't be per-bo and per-vm, but it'd really only make a slight
> > > difference where we have multiple VMAs per bo, in which case per-vm,
> > > per-bo state either needs to be duplicated or attached to a single VMA
> > > (as in the case of the external bo list).
> >
> > Correct, one implication is that we don't get a per VM and BO abstraction, and
> > hence are left with a list of all drm_gpuva structs having the same backing BO,
> > regardless of the VM.
> >
> > For amdgpu this was always a concern. Now that we want to keep track of
> > external and evicted objects, it's going to be a concern for most drivers,
> > I guess. Without a VM_BO abstraction, the only structure left for tracking
> > external and evicted objects is struct drm_gpuva. But this structure isn't
> > unique per VM and BO, and we need to consider cases where userspace just
> > allocates rather huge BOs and creates tons of mappings from them. Walking
> > the full list of drm_gpuva structs (even including the ones from other
> > VMs) for adding an external or evicted object isn't very efficient. Not to
> > mention the maintenance required when the mapping we've (randomly) picked
> > as the entry for the external/evicted object list is unmapped while there
> > are still mappings left in the VM with the same backing BO.
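> >
> > With the VM_BO abstraction, looking up the combination for a given VM
> > stays proportional to the number of VMs the BO is mapped in, roughly
> > like this (a sketch; iterator and field names are approximations):
> >
> > 	struct drm_gpuvm_bo *
> > 	drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj)
> > 	{
> > 		struct drm_gpuvm_bo *vm_bo;
> >
> > 		/* walk the GEM object's list of VM / BO combinations */
> > 		drm_gem_for_each_gpuvm_bo(vm_bo, obj)
> > 			if (vm_bo->vm == gpuvm)
> > 				return vm_bo;
> >
> > 		return NULL;
> > 	}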
> For evicted objects it's not much of an issue; we maintain, per VM, a list
> of VMAs needing rebinding rather than a list of evicted objects, so there
> is no or very little additional overhead there. The extobj list is indeed
> a problem if many VMAs are bound to the same bo. Not that the code
> snippets are complicated, but the list traversals would be excessive.
> >
> > Now, a way to get rid of the VM_BO abstraction would be to use maple
> > trees instead, since then we could store drm_gem_object structs directly
> > for each VM. However, Xe had concerns about using maple trees and
> > preferred lists, and maple trees wouldn't address amdgpu's concern about
> > not having a VM_BO abstraction for cases with tons of VMs and tons of
> > mappings per BO. Hence, a VM_BO abstraction enabling us to track
> > external/evicted objects with lists seems to satisfy everyone's needs.
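> >
> > For reference, the maple tree variant would have keyed a per-VM tree by
> > the GEM object pointer, roughly like this (a sketch against the generic
> > maple tree API, not code from this series):
> >
> > 	#include <linux/maple_tree.h>
> >
> > 	/* per-VM tracking of external objects, keyed by object address */
> > 	static DEFINE_MTREE(extobj);
> >
> > 	static int vm_track_extobj(struct drm_gem_object *obj)
> > 	{
> > 		/* the object pointer doubles as a unique index */
> > 		return mtree_store(&extobj, (unsigned long)obj, obj, GFP_KERNEL);
> > 	}
> >
> > 	static void vm_untrack_extobj(struct drm_gem_object *obj)
> > 	{
> > 		mtree_erase(&extobj, (unsigned long)obj);
> > 	}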
>
> Indeed, this is a tradeoff between a simple implementation that is OK for
> situations with few VMs or VMAs per bo and a more complex implementation
> that optimizes for the opposite case.
>
> So if the latter is a case we need to optimize for at this point, then I
> guess it's the way to go.
> (I'm in the process of adapting the xe driver to this, so I just wanted to
> bring up areas where the implementations differ quite a lot and make sure
> options are discussed).
Thanks, I appreciate that. Just be aware of the locking issue in V3 that
Boris pointed out. I don't know if I'll get to send out a V4 today to fix
that, but I'll try to do it by tomorrow.
- Danilo
>
> Thanks,
>
> Thomas
>
>
> >
> > - Danilo
> >
> > > To me that looks like a substantial amount of less code / complexity?
> > >
> > > /Thomas
> > >
> > >
> > > > > Thanks,
> > > > >
> > > > > Thomas
> > > > >
> > > > >
>