[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <925e4c2d15161ce8115409b6af97dab9a2996bf7.camel@linux.intel.com>
Date: Wed, 11 Oct 2023 10:22:20 +0200
From: Thomas Hellström
<thomas.hellstrom@...ux.intel.com>
To: Dave Airlie <airlied@...il.com>,
Thomas Hellström "(Intel)"
<thomas_os@...pmail.org>
Cc: Danilo Krummrich <dakr@...hat.com>, daniel@...ll.ch,
matthew.brost@...el.com, sarah.walker@...tec.com,
donald.robson@...tec.com, boris.brezillon@...labora.com,
christian.koenig@....com, faith.ekstrand@...labora.com,
bskeggs@...hat.com, Liam.Howlett@...cle.com,
nouveau@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize
dma_resv/extobj handling and GEM validation
On Wed, 2023-10-11 at 06:23 +1000, Dave Airlie wrote:
> > I think we're then optimizing for different scenarios. Our compute
> > driver will use mostly external objects only, and if shared, I
> > don't
> > forsee them bound to many VMs. What saves us currently here is that
> > in
> > compute mode we only really traverse the extobj list after a
> > preempt
> > fence wait, or when a vm is using a new context for the first time.
> > So
> > vm's extobj list is pretty large. Each bo's vma list will typically
> > be
> > pretty small.
>
> Can I ask why we are optimising for this userspace, this seems
> incredibly broken.
First Judging from the discussion with Christian this is not really
uncommon. There *are* ways that we can play tricks in KMD of assorted
cleverness to reduce the extobj list size, but doing that in KMD that
wouldn't be much different than accepting a large extobj list size and
do what we can to reduce overhead of iterating over it.
Second the discussion here really was about whether we should be using
a lower level lock to allow for async state updates, with a rather
complex mechanism with weak reference counting and a requirement to
drop the locks within the loop to avoid locking inversion. If that were
a simplification with little or no overhead all fine, but IMO it's not
a simplification?
>
> We've has this sort of problem in the past with Intel letting the
> tail
> wag the horse, does anyone remember optimising relocations for a
> userspace that didn't actually need to use relocations?
>
> We need to ask why this userspace is doing this, can we get some
> pointers to it? compute driver should have no reason to use mostly
> external objects, the OpenCL and level0 APIs should be good enough to
> figure this out.
TBH for the compute UMD case, I'd be prepared to drop the *performance*
argument of fine-grained locking the extobj list since it's really only
traversed on new contexts and preemption. But as Christian mentions
there might be other cases. We should perhaps figure those out and
document?
/Thoams
>
> Dave.
Powered by blists - more mailing lists