Message-ID: <12f8f138-302d-83d8-3d10-4036400d5482@amd.com>
Date: Thu, 19 Jan 2023 12:33:55 +0100
From: Christian König <christian.koenig@....com>
To: Matthew Brost <matthew.brost@...el.com>,
Danilo Krummrich <dakr@...hat.com>
Cc: Dave Airlie <airlied@...il.com>,
Alex Deucher <alexdeucher@...il.com>, jason@...kstrand.net,
linux-doc@...r.kernel.org, nouveau@...ts.freedesktop.org,
corbet@....net, linux-kernel@...r.kernel.org,
dri-devel@...ts.freedesktop.org, bskeggs@...hat.com,
tzimmermann@...e.de, airlied@...hat.com
Subject: drm_gpuva_manager requirements (was Re: [PATCH drm-next 00/14] [RFC]
DRM GPUVA Manager & Nouveau VM_BIND UAPI)
On 19.01.23 06:23, Matthew Brost wrote:
> [SNIP]
>>>> Userspace (generally Vulkan, some compute) has interfaces that pretty
>>>> much dictate a lot of how VMA tracking works, esp around lifetimes,
>>>> sparse mappings and splitting/merging underlying page tables, I'd
>>>> really like this to be more consistent across drivers, because already
>>>> I think we've seen with freedreno some divergence from amdgpu and we
>>>> also have i915/xe to deal with. I'd like to at least have one place
>>>> that we can say this is how it should work, since this is something
>>>> that *should* be consistent across drivers mostly, as it is more about
>>>> how the uapi is exposed.
>>> That's a really good idea, but the implementation with drm_mm won't work
>>> like that.
>>>
>>> We have Vulkan applications which use the sparse feature to create
>>> literally millions of mappings. That's why I have fine tuned the mapping
> Is this not an application issue? Millions of mappings seems a bit
> absurd to me.
That's unfortunately how some games are designed these days.
>>> structure in amdgpu down to ~80 bytes IIRC and save every CPU cycle
>>> possible in the handling of that.
> We might need to do a bit of work here in Xe as our xe_vma structure is
> quite big, since we currently use it as a dumping ground for various features.
We have done that as well and it turned out to be a bad idea. At one
point we added some power management information into the mapping
structure, but quickly reverted that.
>> That's valuable information. Can you recommend such an application for
>> testing / benchmarking?
>>
> Also interested.
One of the most demanding ones is Forza Horizon 5. The general approach
of that game seems to be to allocate 64 GiB of address space (equal to 16
million 4 KiB pages) and then mmap() whatever data it needs into that
self-managed space, assuming that every 4 KiB page is individually
mappable to a different location.
>> Your optimization effort sounds great. Might it be worth thinking about
>> generalizing your approach by itself and stacking the drm_gpuva_manager on
>> top of it?
>>
> FWIW the Xe team is on board with the drm_gpuva_manager effort; we basically
> open code all of this right now. I'd like to port over to
> drm_gpuva_manager ASAP so we can contribute and help find a viable
> solution for all of us.
Sounds good. I haven't looked into the drm_gpuva_manager code yet, but a
few design lessons I've learned from amdgpu:
Separate address space management (drm_mm) from page table management.
In other words, when an application asks for 64 GiB of free address space
you don't look into the page table structures, but rather into a
separate drm_mm instance. In amdgpu we even moved the latter into
userspace, but the general takeaway is that you have only a handful of
address space requests while you have tons of mapping/unmapping requests.
Separate the tracking structure into two: one for each BO+VM combination
(we call that amdgpu_bo_va) and one for each mapping (called
amdgpu_bo_va_mapping). We unfortunately use that for our hw-dependent
state machine as well, so it isn't easily generalizable.
I've gone back and forth on merging VMAs. Not merging
them can save quite a bit of overhead, but results in many more mappings
for some use cases.
Regards,
Christian.
>
> Matt
>
>>> A drm_mm_node is more in the range of ~200 bytes and certainly not
>>> suitable for this kind of job.
>>>
>>> I strongly suggest using a good chunk of the amdgpu VM code as a
>>> blueprint for the common infrastructure.
>> I will definitely have a look.
>>
>>> Regards,
>>> Christian.
>>>
>>>> Dave.