Message-ID: <20230309104841.7c03d5b4@collabora.com>
Date: Thu, 9 Mar 2023 10:48:41 +0100
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Danilo Krummrich <dakr@...hat.com>
Cc: airlied@...il.com, daniel@...ll.ch, tzimmermann@...e.de,
mripard@...nel.org, corbet@....net, christian.koenig@....com,
bskeggs@...hat.com, Liam.Howlett@...cle.com,
matthew.brost@...el.com, alexdeucher@...il.com, ogabbay@...nel.org,
bagasdotme@...il.com, willy@...radead.org, jason@...kstrand.net,
dri-devel@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH drm-next v2 00/16] [RFC] DRM GPUVA Manager & Nouveau
VM_BIND UAPI
On Thu, 9 Mar 2023 10:12:43 +0100
Boris Brezillon <boris.brezillon@...labora.com> wrote:
> Hi Danilo,
>
> On Fri, 17 Feb 2023 14:44:06 +0100
> Danilo Krummrich <dakr@...hat.com> wrote:
>
> > Changes in V2:
> > ==============
> > Nouveau:
> > - Reworked the Nouveau VM_BIND UAPI to avoid memory allocations in fence
> > signalling critical sections. Updates to the VA space are split up into
> > three separate stages, where only the 2nd stage executes in a fence
> > signalling critical section:
> >
> > 1. update the VA space, allocate new structures and page tables
>
> Sorry for the silly question, but I couldn't find where the page-table
> pre-allocation happens. Mind pointing me to it? It's also unclear when
> this step happens. Is it done at bind-job submission time, when the job
> is not necessarily ready to run and might still be waiting for other
> deps to be signaled? Or is it done once all deps are met, as an extra
> step before jumping to step 2? If it's the former, then I don't see how
> the VA space update can happen, since the bind-job might depend on
> other bind-jobs modifying the same portion of the VA space (unbind ops
> might make intermediate page-table levels disappear while we are
> waiting for deps). If it's the latter, I wonder why this is not
> considered an allocation in the fence signalling path (for the
> bind-job out-fence to be signaled, you need these allocations to
> succeed, unless failing to allocate page tables is treated as a HW
> misbehavior, with the fence being signaled with an error in that case).
Ok, so I just noticed you only have one bind queue per drm_file
(cli->sched_entity), and jobs are executed in-order on a given queue,
so I guess that allows you to modify the VA space at submit time
without risking concurrent modifications coming from other bind
queues targeting the same VM. And, if I'm correct, synchronous
bind/unbind ops take the same path, so there's no risk of those
modifying the VA space either (I just wonder whether it's a good
thing to have synchronous bind/unbind operations waiting on async
ones, but that's a different topic).
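
IOW, here's roughly how I picture the submission path. All names
below are made up (I haven't dug into the actual implementation yet),
this is just to check I got the ordering right:

static int bind_ioctl(struct drm_device *dev, void *data,
                      struct drm_file *file)
{
        struct client *cli = file->driver_priv;
        struct bind_job *job;
        int ret;

        job = bind_job_create(cli, data);
        if (IS_ERR(job))
                return PTR_ERR(job);

        /*
         * Submit time: touching the VA space here is safe because all
         * bind/unbind requests for this VM, sync ops included, funnel
         * through the same in-order queue (cli->sched_entity).
         */
        ret = bind_job_prepare(job);    /* stage 1: VA update + prealloc */
        if (ret) {
                bind_job_free(job);
                return ret;
        }

        /* Hand over to the scheduler; the out-fence signals completion. */
        bind_job_push(job);
        return 0;
}
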
>
> Note that I'm not familiar at all with Nouveau or TTM, and it might
> be something that's solved by another component, or I'm just
> misunderstanding how the whole thing is supposed to work. This being
> said, I'd really like to implement a VM_BIND-like uAPI in pancsf using
> the gpuva_manager infra you're proposing here, so please bear with me
> :-).
>
> > 2. (un-)map the requested memory bindings
> > 3. free structures and page tables
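
And to expand on what I mean by the stage 1/2/3 split, still with the
same made-up names (so a sketch of my understanding, not of your
actual code):

/*
 * Stage 1: runs at submit time, outside the fence signalling critical
 * section, so it may sleep/allocate. The VA space is updated and every
 * structure/page table the map will need is pre-allocated here.
 */
static int bind_job_prepare(struct bind_job *job)
{
        int ret;

        ret = va_space_update(job->vm, &job->ops);
        if (ret)
                return ret;

        return page_tables_prealloc(job->vm, &job->ops);
}

/*
 * Stage 2: called by the scheduler, inside the fence signalling
 * critical section. It only (un-)maps using what was pre-allocated in
 * stage 1, so no memory allocation is allowed here.
 */
static void bind_job_run(struct bind_job *job)
{
        vm_apply_mappings(job->vm, &job->ops);
}

/*
 * Stage 3: runs once the out-fence has been signalled, freeing
 * whatever the unmaps made unused (page tables, VA structures).
 */
static void bind_job_cleanup(struct bind_job *job)
{
        page_tables_free_unused(job->vm, &job->ops);
        va_structs_free(&job->ops);
}
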
> >
> > - Separated generic job scheduler code from specific job implementations.
> > - Separated the EXEC and VM_BIND implementation of the UAPI.
> > - Reworked the locking parts of the nvkm/vmm RAW interface, such that
> > (un-)map operations can be executed in fence signalling critical sections.
> >
>
> Regards,
>
> Boris
>