Message-ID: <e4d93a84-5a8d-4f1e-874e-901b13570e92@amd.com>
Date: Fri, 16 Jan 2026 20:34:15 +0800
From: Honglei Huang <honghuan@....com>
To: Akihiko Odaki <odaki@....ci.i.u-tokyo.ac.jp>
Cc: Gurchetan Singh <gurchetansingh@...omium.org>,
Chia-I Wu <olvaffe@...il.com>, dri-devel@...ts.freedesktop.org,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
Honglei Huang <honglei1.huang@....com>, David Airlie <airlied@...hat.com>,
Ray.Huang@....com, Gerd Hoffmann <kraxel@...hat.com>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>,
Thomas Zimmermann <tzimmermann@...e.de>, Maxime Ripard <mripard@...nel.org>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH v4 0/5] virtio-gpu: Add userptr support for compute
workloads
On 2026/1/16 19:03, Akihiko Odaki wrote:
> On 2026/01/16 19:32, Honglei Huang wrote:
>>
>>
>> On 2026/1/16 18:01, Akihiko Odaki wrote:
>>> On 2026/01/16 18:39, Honglei Huang wrote:
>>>>
>>>>
>>>> On 2026/1/16 16:54, Akihiko Odaki wrote:
>>>>> On 2026/01/16 16:20, Honglei Huang wrote:
>>>>>>
>>>>>>
>>>>>> On 2026/1/15 17:20, Akihiko Odaki wrote:
>>>>>>> On 2026/01/15 16:58, Honglei Huang wrote:
>>>>>>>> From: Honglei Huang <honghuan@....com>
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> This series adds virtio-gpu userptr support to enable ROCm native
>>>>>>>> context for compute workloads. The userptr feature allows the
>>>>>>>> host to
>>>>>>>> directly access guest userspace memory without memcpy overhead,
>>>>>>>> which is
>>>>>>>> essential for GPU compute performance.
>>>>>>>>
>>>>>>>> The userptr implementation provides buffer-based zero-copy
>>>>>>>> memory access.
>>>>>>>> This approach pins guest userspace pages and exposes them to the
>>>>>>>> host
>>>>>>>> via scatter-gather tables, enabling efficient compute operations.
>>>>>>>
>>>>>>> This description looks identical to what
>>>>>>> VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST does, so there should be some
>>>>>>> explanation of how it differs.
>>>>>>>
>>>>>>> I have already pointed this out when reviewing the QEMU patches[1],
>>>>>>> but I note it here too, since QEMU is just a middleman and this
>>>>>>> matter is better discussed by Linux and virglrenderer developers.
>>>>>>>
>>>>>>> [1] https://lore.kernel.org/qemu-devel/35a8add7-da49-4833-9e69-d213f52c771a@....com/
>>>>>>>
>>>>>>
>>>>>> Thanks for raising this important point about the distinction between
>>>>>> VIRTGPU_BLOB_FLAG_USE_USERPTR and VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST.
>>>>>> I might not have explained it clearly previously.
>>>>>>
>>>>>> The key difference is memory ownership and lifecycle:
>>>>>>
>>>>>> BLOB_MEM_HOST3D_GUEST:
>>>>>> - Kernel allocates memory (drm_gem_shmem_create)
>>>>>> - Userspace accesses via mmap(GEM_BO)
>>>>>> - Use case: Graphics resources (Vulkan/OpenGL)
>>>>>>
>>>>>> BLOB_FLAG_USE_USERPTR:
>>>>>> - Userspace pre-allocates memory (malloc/mmap)
>>>>>
>>>>> "Kernel allocates memory" and "userspace pre-allocates memory" is a
>>>>> bit ambiguous phrasing. Either way, the userspace requests the
>>>>> kernel to map memory with a system call, brk() or mmap().
>>>>
>>>> They are different:
>>>> BLOB_MEM_HOST3D_GUEST (kernel-managed pages):
>>>> - Allocated via drm_gem_shmem_create() as GFP_KERNEL pages
>>>> - Kernel guarantees pages won't swap or migrate while GEM object
>>>> exists
>>>> - Physical addresses remain stable → safe for DMA
>>>>
>>>> BLOB_FLAG_USE_USERPTR (userspace pages):
>>>> - From regular malloc/mmap - subject to MM policies
>>>> - Can be swapped, migrated, or compacted by kernel
>>>> - Requires FOLL_LONGTERM pinning to make DMA-safe
>>>>
>>>> The device must treat them differently. Kernel-managed pages have
>>>> stable physical addresses. Userspace pages need explicit pinning and
>>>> the device must be prepared for potential invalidation.
>>>>
>>>> This is why all compute drivers (amdgpu, i915, nouveau) implement
>>>> userptr - to make arbitrary userspace allocations DMA-accessible
>>>> while respecting their different page mobility characteristics.
>>>> And DRM already has a better framework for it: SVM; this series is a
>>>> super simplified version of it.
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gpusvm.c#:~:text=*%20GPU%20Shared%20Virtual%20Memory%20(GPU%20SVM)%20layer%20for%20the%20Direct%20Rendering%20Manager%20(DRM)
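
For illustration, here is a minimal sketch of that guest-side pin +
sg_table step. The function name and error handling are simplified and
do not match the actual patch code; it only shows the FOLL_LONGTERM
pinning mentioned above (needs <linux/mm.h>, <linux/scatterlist.h>,
<linux/slab.h>):

static int userptr_pin_pages(unsigned long userptr, size_t size,
                             struct sg_table *sgt)
{
        unsigned int npages = DIV_ROUND_UP(size, PAGE_SIZE);
        struct page **pages;
        long pinned;
        int ret;

        pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
                return -ENOMEM;

        /* FOLL_LONGTERM: keep the pages pinned for the BO lifetime so the
         * MM will not swap or migrate them under the device. */
        pinned = pin_user_pages_fast(userptr, npages,
                                     FOLL_WRITE | FOLL_LONGTERM, pages);
        if (pinned != npages) {
                if (pinned > 0)
                        unpin_user_pages(pages, pinned);
                kvfree(pages);
                return pinned < 0 ? pinned : -EFAULT;
        }

        /* Expose the pinned pages to the host via a scatter-gather table;
         * at teardown the pages are unpinned again (not shown). */
        ret = sg_alloc_table_from_pages(sgt, pages, npages, 0, size,
                                        GFP_KERNEL);
        if (ret)
                unpin_user_pages(pages, npages);
        kvfree(pages);
        return ret;
}
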
>>>
>>> I referred to the phrasing "kernel allocates" vs "userspace allocates".
>>> Using GFP_KERNEL, swapping, migrating, or pinning is all done by the
>>> kernel.
>>
>> I am talking about the virtio-gpu driver side; the virtio-gpu driver
>> needs to handle those two types of memory differently.
>>
>>>
>>>>
>>>>
>>>>>
>>>>>> - Kernel only gets the existing pages
>>>>>> - Use case: Compute workloads (ROCm/CUDA) with large datasets. For
>>>>>> example, the GPU needs to load a 10G+ model file: the UMD mmaps the
>>>>>> file fd and hands the resulting pointer to the driver, so no extra
>>>>>> copy is needed.
>>>>>> But if shmem is used, userspace needs to copy the file data into a
>>>>>> shmem mmap pointer, which adds a copy overhead.
>>>>>>
>>>>>> Userptr:
>>>>>>
>>>>>>   file --> open/mmap --> userspace ptr --> driver
>>>>>>
>>>>>> shmem:
>>>>>>
>>>>>>   file --> open/mmap --> file userptr
>>>>>>                               |
>>>>>>                               | copy
>>>>>>                               v
>>>>>>   user alloc shmem --> mmap shmem --> shmem userspace ptr --> driver
>>>>>>
>>>>>>
>>>>>> For compute workloads, this matters significantly:
>>>>>>
>>>>>> Without userptr: malloc(8GB) → alloc GEM BO → memcpy 8GB →
>>>>>>                  compute → memcpy 8GB back
>>>>>> With userptr:    malloc(8GB) → create userptr BO → compute (zero-copy)
>>>>>
>>>>> Why don't you alloc GEM BO first and read the file into there?
>>>>
>>>> Because that defeats the purpose of zero-copy.
>>>>
>>>> With GEM-BO-first (what you suggest):
>>>>
>>>> void *gembo = virtgpu_gem_create(10GB);    // Allocate GEM buffer
>>>> void *model = mmap(..., model_file_fd, 0); // Map model file
>>>> memcpy(gembo, model, 10GB);                // Copy 10GB - NOT zero-copy
>>>> munmap(model, 10GB);
>>>> gpu_compute(gembo);
>>>>
>>>> Result: 10GB copy overhead + double memory usage during copy.
>>>
>>> How about:
>>>
>>> void *gembo = virtgpu_gem_create(10GB);
>>> read(model_file_fd, gembo, 10GB);
>>
>> I believe there is still a memory copy in the read operation, from
>> model_file_fd to gembo: they are backed by different physical pages.
>> The userptr/SVM feature will access the model_file_fd physical pages
>> directly.
>
> You can use O_DIRECT if you want.
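
For reference, that O_DIRECT route would look roughly like this in the
guest (illustrative fragment only; gem_map_fd stands for whatever fd the
blob BO can be mmap'ed through, and O_DIRECT requires the length and
buffer address to satisfy the filesystem block-size alignment; needs
<fcntl.h>, <sys/mman.h>, <unistd.h>):

int   file_fd = open("model.bin", O_RDONLY | O_DIRECT);
void *gembo   = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                     gem_map_fd, 0);
/* The page cache is bypassed, so the data is read directly into the
 * pages backing the BO mapping - no extra bounce copy in the guest. */
ssize_t n = read(file_fd, gembo, size);
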
>
>>
>>
>>>
>>> Result: zero-copy + simpler code.
>>>
>>>>
>>>> With userptr (zero-copy):
>>>>
>>>> void *model = mmap(..., model_file_fd, 0); // Map model file
>>>> hsa_memory_register(model, 10GB);          // Pin pages, create userptr BO
>>>> gpu_compute(model);                        // GPU reads directly from file pages
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> The explicit flag serves three purposes:
>>>>>>
>>>>>> 1. Although both send scatter-gather entries to the host, the flag
>>>>>> makes the intent unambiguous.
>>>>>
>>>>> Why will the host care?
>>>>
>>>> The flag tells the host this is a userptr; the host side needs to
>>>> handle it specially.
>>>
>>> Please provide the concrete requirement. What is the special handling
>>> the host side needs to perform?
>>
>> Every piece of hardware has its own API for handling userptr; for amdgpu
>> ROCm it is hsaKmtRegisterMemoryWithFlags.
>
> On the host side, BLOB_MEM_HOST3D_GUEST will always result in a
> userspace pointer. Below is how the address is translated:
>
> 1) (with the ioctl you are adding)
> Guest kernel translates guest userspace pointer to guest PA.
> 2) (with IOMMU)
> Guest kernel translates guest PA to device VA
> 3) The host VMM translates device VA to host userspace pointer
> 4) virglrenderer passes userspace pointer to the GPU API (ROCm)
>
> BLOB_FLAG_USE_USERPTR tells that 1) happened. But the subsequent steps
> are not affected by that.
>
>>
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> 2. Ensures consistency between flag and userptr address field.
>>>>>
>>>>> Addresses are represented with the nr_entries and following struct
>>>>> virtio_gpu_mem_entry entries, whenever
>>>>> VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB or
>>>>> VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING is used. Having a special
>>>>> flag introduces inconsistency.
>>>>
>>>> For this part I am talking about the virtio-gpu guest UMD side: in the
>>>> blob-create ioctl we need this flag to check the userptr address and
>>>> the read-only flag:
>>>>     if (rc_blob->blob_flags & VIRTGPU_BLOB_FLAG_USE_USERPTR) {
>>>>             if (!rc_blob->userptr)
>>>>                     return -EINVAL;
>>>>     } else {
>>>>             if (rc_blob->userptr)
>>>>                     return -EINVAL;
>>>>
>>>>             if (rc_blob->blob_flags & VIRTGPU_BLOB_FLAG_USERPTR_RDONLY)
>>>>                     return -EINVAL;
>>>>     }
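
From the guest userspace side, creating such a resource would look roughly
like this (a sketch against the uAPI proposed in this series; the userptr
field and VIRTGPU_BLOB_FLAG_USE_USERPTR are the new additions, blob_mem is
assumed to be VIRTGPU_BLOB_MEM_GUEST, and drm_fd/buf/buf_size are
placeholders):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

struct drm_virtgpu_resource_create_blob args;

memset(&args, 0, sizeof(args));
args.blob_mem   = VIRTGPU_BLOB_MEM_GUEST;
args.blob_flags = VIRTGPU_BLOB_FLAG_USE_USERPTR;
args.size       = buf_size;            /* page-aligned buffer size */
args.userptr    = (uintptr_t)buf;      /* proposed new field       */

ioctl(drm_fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB, &args);
/* On success, args.bo_handle / args.res_handle name the userptr BO. */
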
>>>
>>> I see. That shows VIRTGPU_BLOB_FLAG_USE_USERPTR is necessary for the
>>> ioctl.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> 3. Future HMM support: There is a plan to upgrade the userptr
>>>>>> implementation to use Heterogeneous Memory Management (HMM) for
>>>>>> better GPU coherency and dynamic page migration. The flag provides
>>>>>> a clean path to that future upgrade.
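
To sketch what that HMM-based path could look like on the guest side
(illustrative only, struct and function names here are hypothetical, not
the patch code): instead of FOLL_LONGTERM pinning, the driver would track
the range with an MMU interval notifier and revalidate the mapping when
the notifier fires.

/* Needs <linux/mmu_notifier.h>. */
struct my_userptr_bo {
        struct mmu_interval_notifier notifier;
        /* ... other BO state ... */
};

static bool userptr_invalidate(struct mmu_interval_notifier *mni,
                               const struct mmu_notifier_range *range,
                               unsigned long cur_seq)
{
        /* The CPU page tables for this range are changing: bump the
         * sequence so the next command submission revalidates it. */
        mmu_interval_set_seq(mni, cur_seq);
        return true;
}

static const struct mmu_interval_notifier_ops userptr_notifier_ops = {
        .invalidate = userptr_invalidate,
};

/* Called at blob creation time instead of long-term pinning. */
static int userptr_track_range(struct my_userptr_bo *bo,
                               unsigned long start, unsigned long size)
{
        return mmu_interval_notifier_insert(&bo->notifier, current->mm,
                                            start, size,
                                            &userptr_notifier_ops);
}
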
>>>>>
>>>>> What will the upgrade path with the flag and the one without the
>>>>> flag look like, and in what respect is the upgrade path with the
>>>>> flag "cleaner"?
>>>>
>>>> As I mentioned above, userptr handling differs from shmem/GEM BO
>>>> handling.
>>>
>>> All the above describes the guest-internal behavior. What about the
>>> interaction between the guest and host? How will having
>>> VIRTIO_GPU_BLOB_FLAG_USE_USERPTR in virtio, the guest-host interface,
>>> ease a future upgrade?
>>
>> It depends on how we implement it; the current version is the simplest
>> implementation, similar to the userptr implementation in Intel's i915.
>> If the virtio side needs HMM to implement an SVM-style userptr feature,
>> I think VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is a must: the stack needs to
>> know whether a resource is a userptr resource in order to perform
>> advanced operations such as updating page tables, splitting BOs, etc.
>
> Why does the device need to know if it is a userptr resource to perform
> operations when the device always gets device VAs?
>
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> I understand the concern about API complexity. I'll defer to the
>>>>>> virtio-gpu maintainers for the final decision on whether this
>>>>>> design is acceptable or if they prefer an alternative approach.
>>>>>
>>>>> It is fine to have API complexity. The problem here is the lack of
>>>>> clear motivation and documentation.
>>>>>
>>>>> Another way to put this is: how will you explain the flag in the
>>>>> virtio specification? It should say "the driver MAY/SHOULD/MUST do
>>>>> something" and/or "the device MAY/SHOULD/MUST do something", and
>>>>> then Linux and virglrenderer can implement the flag accordingly.
>>>>
>>>> You're absolutely right that the specification should
>>>> be written in proper virtio spec language. The draft should be:
>>>>
>>>> VIRTIO_GPU_BLOB_FLAG_USE_USERPTR:
>>>>
>>>> Linux virtio driver requirements:
>>>> - MUST set userptr to a valid guest userspace VA in
>>>> drm_virtgpu_resource_create_blob
>>>> - SHOULD keep VA mapping valid until resource destruction
>>>> - MUST pin pages or use HMM at blob creation time
>>>
>>> These descriptions are not for the virtio specification. The virtio
>>> specification describes the interaction between the driver and
>>> device. These statements describe the interaction between the guest
>>> userspace and the guest kernel.
>>>
>>>>
>>>> Virglrenderer requirements:
>>>> - MUST use the corresponding API for userptr resources
>>>
>>> What is the "corresponding API"?
>>
>> It could be:
>> **VIRTIO_GPU_BLOB_FLAG_USE_USERPTR specification:**
>>
>> Driver requirements:
>> - MUST populate mem_entry[] with valid guest physical addresses of
>> pinned userspace pages
>
> "Userspace" is a guest-internal concept and irrelevant to the
> interaction between the driver and device.
>
>> - MUST set VIRTIO_GPU_BLOB_FLAG_USE_USERPTR in blob_flags when the
>> resource is a userptr resource
>
> When should the driver use the flag?
>
>> - SHOULD keep pages pinned until VIRTIO_GPU_CMD_RESOURCE_UNREF
>
> It is not a new requirement. The page must stay at the same position
> whether VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is used or not.
>
>>
>> Device requirements:
>> - MUST establish IOMMU mappings for the provided iovec array using the
>> hardware-specific API (hsaKmtRegisterMemoryWithFlags for ROCm)
>
> This should also be true when VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is
> not set.
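
For what it's worth, on the host side the registration boils down to
something like the following in the ROCm case (illustrative, not actual
virglrenderer code; host_ptr/host_size are assumed to come from mapping
the resource's iovecs into one contiguous host VA range; needs libhsakmt
"hsakmt.h" and <string.h>):

HsaMemFlags mem_flags;
memset(&mem_flags, 0, sizeof(mem_flags));
/* Flag fields omitted for brevity; see hsakmttypes.h. */

if (hsaKmtRegisterMemoryWithFlags(host_ptr, host_size, mem_flags) !=
    HSAKMT_STATUS_SUCCESS)
        return -EINVAL;
/* The GPU can now reach host_ptr through its IOMMU/GPUVM mapping. */
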
>
>>
>>
>>
>> Thanks a lot for your comments; I believe we need some input from the
>> virtio-gpu maintainers.
>>
>> The VIRTIO_GPU_BLOB_FLAG_USE_USERPTR flag describes how the resource is
>> used, and it doesn't conflict with VIRTGPU_BLOB_MEM_HOST3D_GUEST. It is
>> just like VIRTGPU_BLOB_FLAG_USE_SHAREABLE: a resource can carry that
>> flag and still be either a guest resource or a host resource.
>>
>> If we don't have the VIRTIO_GPU_BLOB_FLAG_USE_USERPTR flag, we may get
>> resource conflicts on the host side. The guest kernel can use the
>> 'userptr' param to identify such a resource, but on the host side the
>> 'userptr' param is lost and we only know it is a guest resource.
>
> I still don't see why knowing it is a guest resource is insufficient for
> the host.
All right, I totally agree with you.
Maybe it is better to let the virtio-gpu/DRM maintainers decide how to
design the flag/params.
I believe the core gap between you and me is the concept of userptr/SVM.
What userptr/SVM is used for is letting the GPU and CPU share the same
userspace virtual address. Perhaps my description was not accurate enough.
>
> Regards,
> Akihiko Odaki