Message-ID: <5b66df7d-374c-4e9c-88d5-bb514f9a7725@rsg.ci.i.u-tokyo.ac.jp>
Date: Thu, 15 Jan 2026 18:20:22 +0900
From: Akihiko Odaki <odaki@....ci.i.u-tokyo.ac.jp>
To: Honglei Huang <honglei1.huang@....com>, David Airlie
 <airlied@...hat.com>,
        Gerd Hoffmann <kraxel@...hat.com>,
        Dmitry Osipenko <dmitry.osipenko@...labora.com>,
        Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
        Maxime Ripard <mripard@...nel.org>,
        Thomas Zimmermann <tzimmermann@...e.de>,
        Simona Vetter <simona@...ll.ch>, Ray.Huang@....com
Cc: Gurchetan Singh <gurchetansingh@...omium.org>,
        Chia-I Wu <olvaffe@...il.com>, dri-devel@...ts.freedesktop.org,
        virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
        Honglei Huang <honghuan@....com>
Subject: Re: [PATCH v4 0/5] virtio-gpu: Add userptr support for compute
 workloads

On 2026/01/15 16:58, Honglei Huang wrote:
> From: Honglei Huang <honghuan@....com>
> 
> Hello,
> 
> This series adds virtio-gpu userptr support to enable ROCm native
> context for compute workloads. The userptr feature allows the host to
> directly access guest userspace memory without memcpy overhead, which is
> essential for GPU compute performance.
> 
> The userptr implementation provides buffer-based zero-copy memory access.
> This approach pins guest userspace pages and exposes them to the host
> via scatter-gather tables, enabling efficient compute operations.

This description looks identical to what 
VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST does, so there should be some 
explanation of how it differs.

I already pointed this out when reviewing the QEMU patches [1], but 
I am noting it here too, since QEMU is just a middleman and this matter is 
better discussed by Linux and virglrenderer developers.

[1] 
https://lore.kernel.org/qemu-devel/35a8add7-da49-4833-9e69-d213f52c771a@amd.com/

> 
> Key features:
> - Zero-copy memory access between guest userspace and host GPU
> - Read-only and read-write userptr support
> - Runtime feature detection via VIRTGPU_PARAM_RESOURCE_USERPTR
> - ROCm capset support for ROCm stack integration
> - Proper page lifecycle management with FOLL_LONGTERM pinning
> 
> Patches overview:
> 1. Add VIRTIO_GPU_CAPSET_ROCM capability for compute workloads
> 2. Add virtio-gpu API definitions for userptr blob resources
> 3. Extend DRM UAPI with comprehensive userptr support
> 4. Implement core userptr functionality with page management
> 5. Integrate userptr into blob resource creation and advertise to userspace
> 
> Performance: In popular compute benchmarks, this implementation achieves
> approximately 70% of bare-metal OpenCL performance on AMD V2000 hardware
> and 92% on AMD W7900 hardware.
> 
> Testing: Verified with ROCm stack and OpenCL applications in VIRTIO virtualized
> environments.
> - The full OpenCL CTS passed on ROCm 5.7.0 on the V2000 platform.
> - Nearly 70% of the OpenCL CTS tests passed on ROCm 7.0 on the W7900 platform.
> - Most HIP catch tests passed on ROCm 7.0 on the W7900 platform.
> - Some AI applications were enabled on ROCm 7.0 on the W7900 platform.
> 
> V4 changes:
>      - Renamed VIRTIO_GPU_CAPSET_HSAKMT to VIRTIO_GPU_CAPSET_ROCM
>      - Removed userptr feature probing because it can reuse the guest
>        blob resource code path, reducing the patch count from 6 to 5
>      - Updated corresponding commit messages
>      - Consolidated userptr feature detection in final patch
>      - Updated corresponding cover letter content
> 
> V3 changes:
>      - Split into focused patches for easier review
>      - Removed complex interval tree userptr management
>      - Simplified resource creation without deduplication
>      - Added VIRTGPU_PARAM_RESOURCE_USERPTR for feature detection
>      - Improved UAPI documentation and error handling
>      - Enhanced code quality with proper cleanup paths
>      - Removed MMU notifier dependencies for simplicity
>      - Fixed resource lifecycle management issues
> 
> V2: - Split the HSAKMT context and blob userptr resource additions into
>        two patches.
>      - Removed MMU notifier related patches, because using non-movable
>        userspace memory with an MMU notifier is not a good idea.
>      - Removed the HSAKMT context check at context creation, so that all
>        contexts support the userptr feature.
>      - Removed MMU notifier related content from the cover letter.
>      - Added more comments for patch 6 in the cover letter.
> 
> Honglei Huang (5):
>    drm/virtio-gpu: Add VIRTIO_GPU_CAPSET_ROCM capability
>    virtio-gpu api: add blob userptr resource
>    drm/virtgpu api: add blob userptr resource
>    drm/virtio: implement userptr support for zero-copy memory access
>    drm/virtio: advertise base userptr feature to userspace
> 
>   drivers/gpu/drm/virtio/Makefile          |   3 +-
>   drivers/gpu/drm/virtio/virtgpu_drv.h     |  33 ++++
>   drivers/gpu/drm/virtio/virtgpu_ioctl.c   |   9 +-
>   drivers/gpu/drm/virtio/virtgpu_object.c  |   6 +
>   drivers/gpu/drm/virtio/virtgpu_userptr.c | 231 +++++++++++++++++++++++
>   include/uapi/drm/virtgpu_drm.h           |   9 +
>   include/uapi/linux/virtio_gpu.h          |   7 +
>   7 files changed, 295 insertions(+), 3 deletions(-)
>   create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c
> 
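
For reference, the quoted cover letter says the implementation pins guest
userspace pages and exposes them to the host via scatter-gather tables. A
first step in any such scheme is working out how many pages an arbitrary
userspace range touches, since the start address need not be page-aligned.
The following is only a minimal sketch of that arithmetic and is not taken
from the patches: the helper name userptr_page_count and the fixed 4 KiB
page size are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. Kernel code would use PAGE_SIZE/PAGE_SHIFT. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [addr, addr + size). The range may start and end mid-page.
 * The caller is assumed to have rejected ranges where addr + size
 * would wrap around.
 */
static uint64_t userptr_page_count(uint64_t addr, uint64_t size)
{
    uint64_t first, last;

    if (size == 0)
        return 0;

    first = addr / SKETCH_PAGE_SIZE;               /* index of first page */
    last = (addr + size - 1) / SKETCH_PAGE_SIZE;   /* index of last page  */
    return last - first + 1;
}
```

An unaligned 32-byte buffer that straddles a page boundary therefore needs
two pages pinned, while a page-aligned 4096-byte buffer needs one.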

