Message-ID: <b69439ec-0ebd-4527-873b-85b283e03888@rsg.ci.i.u-tokyo.ac.jp>
Date: Fri, 16 Jan 2026 22:22:22 +0900
From: Akihiko Odaki <odaki@....ci.i.u-tokyo.ac.jp>
To: Honglei Huang <honghuan@....com>
Cc: Gurchetan Singh <gurchetansingh@...omium.org>,
Chia-I Wu <olvaffe@...il.com>, dri-devel@...ts.freedesktop.org,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
Honglei Huang <honglei1.huang@....com>,
David Airlie <airlied@...hat.com>, Ray.Huang@....com,
Gerd Hoffmann <kraxel@...hat.com>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>,
Thomas Zimmermann <tzimmermann@...e.de>,
Maxime Ripard <mripard@...nel.org>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH v4 0/5] virtio-gpu: Add userptr support for compute
workloads
On 2026/01/16 21:34, Honglei Huang wrote:
>
>
> On 2026/1/16 19:03, Akihiko Odaki wrote:
>> On 2026/01/16 19:32, Honglei Huang wrote:
>>>
>>>
>>> On 2026/1/16 18:01, Akihiko Odaki wrote:
>>>> On 2026/01/16 18:39, Honglei Huang wrote:
>>>>>
>>>>>
>>>>> On 2026/1/16 16:54, Akihiko Odaki wrote:
>>>>>> On 2026/01/16 16:20, Honglei Huang wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 2026/1/15 17:20, Akihiko Odaki wrote:
>>>>>>>> On 2026/01/15 16:58, Honglei Huang wrote:
>>>>>>>>> From: Honglei Huang <honghuan@....com>
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> This series adds virtio-gpu userptr support to enable ROCm native
>>>>>>>>> context for compute workloads. The userptr feature allows the host
>>>>>>>>> to directly access guest userspace memory without memcpy overhead,
>>>>>>>>> which is essential for GPU compute performance.
>>>>>>>>>
>>>>>>>>> The userptr implementation provides buffer-based zero-copy memory
>>>>>>>>> access. This approach pins guest userspace pages and exposes them
>>>>>>>>> to the host via scatter-gather tables, enabling efficient compute
>>>>>>>>> operations.
>>>>>>>>
>>>>>>>> This description looks identical to what
>>>>>>>> VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST does, so there should be some
>>>>>>>> explanation of how it differs.
>>>>>>>>
>>>>>>>> I have already pointed this out when reviewing the QEMU
>>>>>>>> patches[1], but I note it here too, since QEMU is just a
>>>>>>>> middleman and this matter is better discussed by Linux and
>>>>>>>> virglrenderer developers.
>>>>>>>>
>>>>>>>> [1] https://lore.kernel.org/qemu-devel/35a8add7-da49-4833-9e69-d213f52c771a@....com/
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for raising this important point about the distinction
>>>>>>> between
>>>>>>> VIRTGPU_BLOB_FLAG_USE_USERPTR and VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST.
>>>>>>> I might not have explained it clearly previously.
>>>>>>>
>>>>>>> The key difference is memory ownership and lifecycle:
>>>>>>>
>>>>>>> BLOB_MEM_HOST3D_GUEST:
>>>>>>> - Kernel allocates memory (drm_gem_shmem_create)
>>>>>>> - Userspace accesses via mmap(GEM_BO)
>>>>>>> - Use case: Graphics resources (Vulkan/OpenGL)
>>>>>>>
>>>>>>> BLOB_FLAG_USE_USERPTR:
>>>>>>> - Userspace pre-allocates memory (malloc/mmap)
>>>>>>
>>>>>> "Kernel allocates memory" versus "userspace pre-allocates memory"
>>>>>> is somewhat ambiguous phrasing. Either way, userspace requests that
>>>>>> the kernel map memory with a system call, brk() or mmap().
>>>>>
>>>>> They are different:
>>>>> BLOB_MEM_HOST3D_GUEST (kernel-managed pages):
>>>>> - Allocated via drm_gem_shmem_create() as GFP_KERNEL pages
>>>>> - Kernel guarantees pages won't be swapped or migrated while the
>>>>> GEM object exists
>>>>> - Physical addresses remain stable → safe for DMA
>>>>>
>>>>> BLOB_FLAG_USE_USERPTR (userspace pages):
>>>>> - From regular malloc/mmap - subject to MM policies
>>>>> - Can be swapped, migrated, or compacted by kernel
>>>>> - Requires FOLL_LONGTERM pinning to make DMA-safe
>>>>>
>>>>> The device must treat them differently. Kernel-managed pages have
>>>>> stable physical addresses. Userspace pages need explicit pinning,
>>>>> and the device must be prepared for potential invalidation.
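>>>>>
>>>>> To make the pinning difference concrete, here is a minimal guest
>>>>> kernel sketch of the userptr path (not the exact code from this
>>>>> series; 'uaddr', 'size' and 'sgt' are illustrative and assumed
>>>>> page-aligned, error paths abbreviated):
>>>>>
>>>>> unsigned int npages = size >> PAGE_SHIFT;
>>>>> struct page **pages = kvmalloc_array(npages, sizeof(*pages), GFP_KERNEL);
>>>>> struct sg_table sgt;
>>>>> int pinned, ret;
>>>>>
>>>>> /* long-term pin: the pages cannot be swapped or migrated while pinned */
>>>>> pinned = pin_user_pages_fast(uaddr, npages,
>>>>>                              FOLL_WRITE | FOLL_LONGTERM, pages);
>>>>> if (pinned != npages)
>>>>>         goto err_unpin;         /* unpin any partial result and bail out */
>>>>>
>>>>> /* expose the pinned pages to the device as a scatter-gather table */
>>>>> ret = sg_alloc_table_from_pages(&sgt, pages, npages, 0, size, GFP_KERNEL);
>>>>>
>>>>> A BLOB_MEM_HOST3D_GUEST resource does not need this step because, as
>>>>> described above, its shmem pages are allocated and held by the kernel.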
>>>>>
>>>>> This is why all compute drivers (amdgpu, i915, nouveau) implement
>>>>> userptr - to make arbitrary userspace allocations DMA-accessible
>>>>> while respecting their different page mobility characteristics.
>>>>> And DRM already has a better framework for it, SVM; this version is
>>>>> a greatly simplified version of that.
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gpusvm.c#:~:text=*%20GPU%20Shared%20Virtual%20Memory%20(GPU%20SVM)%20layer%20for%20the%20Direct%20Rendering%20Manager%20(DRM)
>>>>
>>>> I was referring to the phrasing "kernel allocates" vs "userspace
>>>> allocates". Using GFP_KERNEL, swapping, migrating, or pinning is all
>>>> done by the kernel.
>>>
>>> I am talking about the virtio-gpu driver side; the virtio-gpu driver
>>> needs to handle those two types of memory differently.
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> - Kernel only gets existing pages
>>>>>>> - Use case: Compute workloads (ROCm/CUDA) with large datasets.
>>>>>>> For example, the GPU needs to load a big model file (10G+); the
>>>>>>> UMD mmaps the file fd and passes the mmap pointer to the driver,
>>>>>>> so the driver does not need another copy.
>>>>>>> But if shmem is used, userspace needs to copy the file data into
>>>>>>> a shmem mmap pointer, so there is a copy overhead.
>>>>>>>
>>>>>>> Userptr:
>>>>>>>
>>>>>>> file -> open/mmap -> userspace ptr -> driver
>>>>>>>
>>>>>>> shmem:
>>>>>>>
>>>>>>> user alloc shmem ──→ mmap shmem ──→ shmem userspace ptr -> driver
>>>>>>> ↑
>>>>>>> │ copy
>>>>>>> │
>>>>>>> file ──→ open/mmap ──→ file userptr ──────────┘
>>>>>>>
>>>>>>>
>>>>>>> For compute workloads, this matters significantly:
>>>>>>> Without userptr: malloc(8GB) → alloc GEM BO → memcpy 8GB →
>>>>>>> compute → memcpy 8GB back
>>>>>>> With userptr: malloc(8GB) → create userptr BO → compute
>>>>>>> (zero-copy)
>>>>>>
>>>>>> Why don't you alloc a GEM BO first and read the file into it?
>>>>>
>>>>> Because that defeats the purpose of zero-copy.
>>>>>
>>>>> With GEM-BO-first (what you suggest):
>>>>>
>>>>> void *gembo = virtgpu_gem_create(10GB);    // Allocate GEM buffer
>>>>> void *model = mmap(..., model_file_fd, 0); // Map model file
>>>>> memcpy(gembo, model, 10GB);                // Copy 10GB - NOT zero-copy
>>>>> munmap(model, 10GB);
>>>>> gpu_compute(gembo);
>>>>>
>>>>> Result: 10GB copy overhead + double memory usage during copy.
>>>>
>>>> How about:
>>>>
>>>> void *gembo = virtgpu_gem_create(10GB);
>>>> read(model_file_fd, gembo, 10GB);
>>>
>>> I believe there is still a memory copy in the read operation
>>> (model_file_fd -> gembo); they have different physical pages,
>>> but the userptr/SVM feature will access the model_file_fd's physical
>>> pages directly.
>>
>> You can use O_DIRECT if you want.
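>>
>> For instance, a minimal userspace sketch (assuming the GEM BO mapping
>> 'bo_map', the transfer size and the file offsets all satisfy the
>> O_DIRECT alignment requirements; the names are illustrative, not from
>> this series):
>>
>> #define _GNU_SOURCE          /* for O_DIRECT */
>> #include <fcntl.h>
>> #include <unistd.h>
>>
>> int fd = open("model.bin", O_RDONLY | O_DIRECT);
>> size_t done = 0;
>>
>> while (done < size) {
>>         /* read straight into the BO pages, bypassing the page cache */
>>         ssize_t n = pread(fd, (char *)bo_map + done, size - done, done);
>>         if (n <= 0)
>>                 break;       /* error handling elided */
>>         done += n;
>> }
>> close(fd);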
>>
>>>
>>>
>>>>
>>>> Result: zero-copy + simpler code.
>>>>
>>>>>
>>>>> With userptr (zero-copy):
>>>>>
>>>>> void *model = mmap(..., model_file_fd, 0); // Map model file
>>>>> hsa_memory_register(model, 10GB);          // Pin pages, create userptr BO
>>>>> gpu_compute(model);                        // GPU reads directly from file pages
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> The explicit flag serves three purposes:
>>>>>>>
>>>>>>> 1. Although both send scatter-gather entries to the host, the
>>>>>>> flag makes the intent unambiguous.
>>>>>>
>>>>>> Why will the host care?
>>>>>
>>>>> The flag tells the host this is a userptr; the host side needs to
>>>>> handle it specially.
>>>>
>>>> Please provide the concrete requirement. What is the special
>>>> handling the host side needs to perform?
>>>
>>> Every piece of hardware has its own special API to handle userptr; for
>>> amdgpu ROCm it is hsaKmtRegisterMemoryWithFlags.
>>
>> On the host side, BLOB_MEM_HOST3D_GUEST will always result in a
>> userspace pointer. Below is how the address is translated:
>>
>> 1) (with the ioctl you are adding)
>> Guest kernel translates guest userspace pointer to guest PA.
>> 2) (with IOMMU)
>> Guest kernel translates guest PA to device VA
>> 3) The host VMM translates device VA to host userspace pointer
>> 4) virglrenderer passes userspace pointer to the GPU API (ROCm)
>>
>> BLOB_FLAG_USE_USERPTR tells that 1) happened, but the succeeding
>> process is not affected by that.
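>>
>> As an illustration of 4), a rough host-side sketch, assuming the VMM
>> has already resolved the blob's mem entries into host-visible iovecs
>> that it mapped contiguously ('iovs' and 'niov' are illustrative names,
>> not virglrenderer's actual interface):
>>
>> size_t total = 0;
>>
>> for (int i = 0; i < niov; i++)
>>         total += iovs[i].iov_len;
>>
>> /* the GPU runtime only ever sees a host userspace pointer; a
>>  * registration call like this is needed whether or not the guest
>>  * set BLOB_FLAG_USE_USERPTR */
>> hsa_status_t st = hsa_memory_register(iovs[0].iov_base, total);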
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> 2. Ensures consistency between the flag and the userptr address field.
>>>>>>
>>>>>> Addresses are represented with the nr_entries field and the
>>>>>> following struct virtio_gpu_mem_entry entries whenever
>>>>>> VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB or
>>>>>> VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING is used. Having a special
>>>>>> flag introduces inconsistency.
>>>>>
>>>>> For this part I am talking about the virtio-gpu guest UMD side; in
>>>>> the blob create ioctl we need this flag to check the userptr address
>>>>> and whether it is a read-only attribute:
>>>>>
>>>>> if (rc_blob->blob_flags & VIRTGPU_BLOB_FLAG_USE_USERPTR) {
>>>>>         if (!rc_blob->userptr)
>>>>>                 return -EINVAL;
>>>>> } else {
>>>>>         if (rc_blob->userptr)
>>>>>                 return -EINVAL;
>>>>>
>>>>>         if (rc_blob->blob_flags & VIRTGPU_BLOB_FLAG_USERPTR_RDONLY)
>>>>>                 return -EINVAL;
>>>>> }
>>>>
>>>> I see. That shows VIRTGPU_BLOB_FLAG_USE_USERPTR is necessary for the
>>>> ioctl.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> 3. Future HMM support: There is a plan to upgrade the userptr
>>>>>>> implementation to use Heterogeneous Memory Management for better
>>>>>>> GPU coherency and dynamic page migration. The flag provides a
>>>>>>> clean path to a future upgrade.
>>>>>>
>>>>>> What will the upgrade path with the flag and the one without the
>>>>>> flag look like, and in what respect is the upgrade path with the
>>>>>> flag "cleaner"?
>>>>>
>>>>> As I mentioned above, userptr handling is different from shmem/GEM
>>>>> BO handling.
>>>>
>>>> All of the above describes guest-internal behavior. What about the
>>>> interaction between the guest and host? How will having
>>>> VIRTIO_GPU_BLOB_FLAG_USE_USERPTR in virtio, as a guest-host
>>>> interface, ease a future upgrade?
>>>
>>> It depends on how we implement it; the current version is the
>>> simplest implementation, similar to the one in Intel's i915.
>>> If the virtio side needs HMM to implement an SVM-type userptr feature,
>>> I think VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is a must: the stack needs
>>> to know whether it is a userptr resource in order to perform advanced
>>> operations such as updating page tables, splitting BOs, etc.
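>>>
>>> For reference, a rough sketch of the kind of invalidation hook an
>>> HMM/mmu-notifier based userptr would need in the guest kernel (the
>>> names are illustrative, not code from this series):
>>>
>>> static bool userptr_invalidate(struct mmu_interval_notifier *mni,
>>>                                const struct mmu_notifier_range *range,
>>>                                unsigned long cur_seq)
>>> {
>>>         /* the CPU mapping changed: stop device access to the range,
>>>          * revalidate the pages later, and record the new sequence */
>>>         mmu_interval_set_seq(mni, cur_seq);
>>>         return true;
>>> }
>>>
>>> static const struct mmu_interval_notifier_ops userptr_notifier_ops = {
>>>         .invalidate = userptr_invalidate,
>>> };
>>>
>>> /* watch [uaddr, uaddr + size) of the creating task's address space */
>>> ret = mmu_interval_notifier_insert(&bo->notifier, current->mm,
>>>                                    uaddr, size, &userptr_notifier_ops);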
>>
>> Why does the device need to know whether it is a userptr resource to
>> perform operations when the device always gets device VAs?
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> I understand the concern about API complexity. I'll defer to the
>>>>>>> virtio-gpu maintainers for the final decision on whether this
>>>>>>> design is acceptable or if they prefer an alternative approach.
>>>>>>
>>>>>> It is fine to have API complexity. The problem here is the lack of
>>>>>> clear motivation and documentation.
>>>>>>
>>>>>> Another way to put this is: how will you explain the flag in the
>>>>>> virtio specification? It should say "the driver MAY/SHOULD/MUST do
>>>>>> something" and/or "the device MAY/SHOULD/MUST do something", and
>>>>>> then Linux and virglrenderer can implement the flag accordingly.
>>>>>
>>>>> You're absolutely right that the specification should
>>>>> be written in proper virtio spec language. The draft should be:
>>>>>
>>>>> VIRTIO_GPU_BLOB_FLAG_USE_USERPTR:
>>>>>
>>>>> Linux virtio driver requirements:
>>>>> - MUST set userptr to a valid guest userspace VA in
>>>>> drm_virtgpu_resource_create_blob
>>>>> - SHOULD keep the VA mapping valid until resource destruction
>>>>> - MUST pin pages or use HMM at blob creation time
>>>>
>>>> These descriptions are not for the virtio specification. The virtio
>>>> specification describes the interaction between the driver and
>>>> device. These statements describe the interaction between the guest
>>>> userspace and the guest kernel.
>>>>
>>>>>
>>>>> Virglrenderer requirements:
>>>>> - must use the corresponding API for userptr resources
>>>>
>>>> What is the "corresponding API"?
>>>
>>> It may be:
>>> **VIRTIO_GPU_BLOB_FLAG_USE_USERPTR specification:**
>>>
>>> Driver requirements:
>>> - MUST populate mem_entry[] with valid guest physical addresses of
>>> pinned userspace pages
>>
>> "Userspace" is a guest-internal concept and irrelevant to the
>> interaction between the driver and the device.
>>
>>> - MUST set blob_mem to VIRTIO_GPU_BLOB_FLAG_USE_USERPTR when using
>>> this flag
>>
>> When should the driver use the flag?
>>
>>> - SHOULD keep pages pinned until VIRTIO_GPU_CMD_RESOURCE_UNREF
>>
>> It is not a new requirement. The page must stay at the same position
>> whether VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is used or not.
>>
>>>
>>> Device requirements:
>>> - MUST establish IOMMU mappings using the provided iovec array with a
>>> specific API (hsaKmtRegisterMemoryWithFlags for ROCm)
>>
>> This should also be true even when VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is
>> not set.
>>
>>>
>>>
>>>
>>> Thanks a lot for your comments, and I believe we need some input from
>>> the virtio-gpu maintainers.
>>>
>>> The VIRTIO_GPU_BLOB_FLAG_USE_USERPTR flag describes how the resource
>>> is used, and it doesn't conflict with VIRTGPU_BLOB_MEM_HOST3D_GUEST,
>>> just like a resource used with VIRTGPU_BLOB_FLAG_USE_SHAREABLE can
>>> still be either a guest resource or a host resource.
>>>
>>> If we don't have the VIRTIO_GPU_BLOB_FLAG_USE_USERPTR flag, we may have
>>> some resource conflicts on the host side. The guest kernel can use the
>>> 'userptr' param to identify the resource, but on the host side the
>>> 'userptr' param is lost and we only know it is a resource with the
>>> guest flag.
>>
>> I still don't see why knowing it is a guest resource is insufficient
>> for the host.
>
> All right, I totally agree with you.
>
> And maybe it is better to let the virtio-gpu/DRM maintainers decide how
> to design the flag/params.
>
>
> I believe the core gap between you and me is the concept of userptr/SVM.
> What is userptr/SVM used for? It lets the GPU and CPU share the userspace
> virtual address space. Perhaps my description was not accurate enough.

That is not what your QEMU patch series does; QEMU sees an address space
bound to the virtio-gpu device, which is not the guest userspace virtual
address space.

Below are my points in the discussion:

- Zero copy is not a new thing; virtio already has features for that:
VIRTIO_GPU_BLOB_MEM_GUEST and VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST.
- You *always* need hsaKmtRegisterMemoryWithFlags() or similar when
implementing VIRTIO_GPU_BLOB_MEM_GUEST and/or
VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST, so having another flag does not make
any difference.
- The guest userspace virtual address is never exposed to the host in
your QEMU patch series, contrary to your description.

Regards,
Akihiko Odaki