Message-ID: <5b66df7d-374c-4e9c-88d5-bb514f9a7725@rsg.ci.i.u-tokyo.ac.jp>
Date: Thu, 15 Jan 2026 18:20:22 +0900
From: Akihiko Odaki <odaki@....ci.i.u-tokyo.ac.jp>
To: Honglei Huang <honglei1.huang@....com>, David Airlie
<airlied@...hat.com>,
Gerd Hoffmann <kraxel@...hat.com>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
Simona Vetter <simona@...ll.ch>, Ray.Huang@....com
Cc: Gurchetan Singh <gurchetansingh@...omium.org>,
Chia-I Wu <olvaffe@...il.com>, dri-devel@...ts.freedesktop.org,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org,
Honglei Huang <honghuan@....com>
Subject: Re: [PATCH v4 0/5] virtio-gpu: Add userptr support for compute
workloads
On 2026/01/15 16:58, Honglei Huang wrote:
> From: Honglei Huang <honghuan@....com>
>
> Hello,
>
> This series adds virtio-gpu userptr support to enable ROCm native
> context for compute workloads. The userptr feature allows the host to
> directly access guest userspace memory without memcpy overhead, which is
> essential for GPU compute performance.
>
> The userptr implementation provides buffer-based zero-copy memory access.
> This approach pins guest userspace pages and exposes them to the host
> via scatter-gather tables, enabling efficient compute operations.
This description looks identical to what
VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST does, so there should be some
explanation of how userptr differs from it.
I already pointed this out when reviewing the QEMU patches[1], but I
note it here too, since QEMU is just a middleman and this matter is
better discussed by Linux and virglrenderer developers.
[1]
https://lore.kernel.org/qemu-devel/35a8add7-da49-4833-9e69-d213f52c771a@amd.com/
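To make the overlap concrete, here is a minimal sketch of how the existing
blob memory types divide up, as far as I understand the current
virtio-gpu definitions. The numeric values are copied from
include/uapi/linux/virtio_gpu.h and redefined locally so the snippet
builds standalone; the enum names and both helper functions are
hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Local copies of the blob memory types. The values mirror
 * VIRTIO_GPU_BLOB_MEM_* in include/uapi/linux/virtio_gpu.h; the
 * shortened names are an assumption made for this sketch.
 */
enum {
	BLOB_MEM_GUEST        = 0x0001, /* VIRTIO_GPU_BLOB_MEM_GUEST */
	BLOB_MEM_HOST3D       = 0x0002, /* VIRTIO_GPU_BLOB_MEM_HOST3D */
	BLOB_MEM_HOST3D_GUEST = 0x0003, /* VIRTIO_GPU_BLOB_MEM_HOST3D_GUEST */
};

/* Hypothetical helper: is the blob backed by guest pages? */
static bool blob_is_guest_backed(uint32_t blob_mem)
{
	return blob_mem == BLOB_MEM_GUEST ||
	       blob_mem == BLOB_MEM_HOST3D_GUEST;
}

/* Hypothetical helper: does the blob have a host-side 3D resource? */
static bool blob_has_host3d_resource(uint32_t blob_mem)
{
	return blob_mem == BLOB_MEM_HOST3D ||
	       blob_mem == BLOB_MEM_HOST3D_GUEST;
}
```

HOST3D_GUEST is the one type for which both helpers return true: guest
pages are handed to the host and tied to a host 3D resource, which is
also how the cover letter describes userptr.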
>
> Key features:
> - Zero-copy memory access between guest userspace and host GPU
> - Read-only and read-write userptr support
> - Runtime feature detection via VIRTGPU_PARAM_RESOURCE_USERPTR
> - ROCm capset support for ROCm stack integration
> - Proper page lifecycle management with FOLL_LONGTERM pinning
>
> Patches overview:
> 1. Add VIRTIO_GPU_CAPSET_ROCM capability for compute workloads
> 2. Add virtio-gpu API definitions for userptr blob resources
> 3. Extend DRM UAPI with comprehensive userptr support
> 4. Implement core userptr functionality with page management
> 5. Integrate userptr into blob resource creation and advertise to userspace
>
> Performance: In popular compute benchmarks, this implementation achieves
> approximately 70% of bare-metal OpenCL performance on AMD V2000 hardware
> and 92% on AMD W7900 hardware.
>
> Testing: Verified with the ROCm stack and OpenCL applications in virtio
> virtualized environments.
> - The full OpenCL CTS passed on ROCm 5.7.0 on the V2000 platform.
> - Nearly 70% of OpenCL CTS tests passed on ROCm 7.0 on the W7900 platform.
> - Most HIP catch tests passed on ROCm 7.0 on the W7900 platform.
> - Some AI applications enabled on ROCm 7.0 on the W7900 platform.
>
> V4 changes:
> - Renamed VIRTIO_GPU_CAPSET_HSAKMT to VIRTIO_GPU_CAPSET_ROCM
> - Removed userptr feature probing because userptr can reuse the guest
> blob resource code path; reduced patch count from 6 to 5
> - Updated corresponding commit messages
> - Consolidated userptr feature detection in final patch
> - Updated corresponding cover letter content
>
> V3 changes:
> - Split into focused patches for easier review
> - Removed complex interval tree userptr management
> - Simplified resource creation without deduplication
> - Added VIRTGPU_PARAM_RESOURCE_USERPTR for feature detection
> - Improved UAPI documentation and error handling
> - Enhanced code quality with proper cleanup paths
> - Removed MMU notifier dependencies for simplicity
> - Fixed resource lifecycle management issues
>
> V2 changes:
> - Split the HSAKMT context and the blob userptr resource into two patches.
> - Removed MMU notifier related patches, because using an MMU notifier with
> non-movable userspace memory is not a good idea.
> - Removed the HSAKMT context check at context creation so that all contexts
> support the userptr feature.
> - Removed MMU notifier related content from the cover letter.
> - Added more comments for patch 6 in the cover letter.
>
> Honglei Huang (5):
> drm/virtio-gpu: Add VIRTIO_GPU_CAPSET_ROCM capability
> virtio-gpu api: add blob userptr resource
> drm/virtgpu api: add blob userptr resource
> drm/virtio: implement userptr support for zero-copy memory access
> drm/virtio: advertise base userptr feature to userspace
>
> drivers/gpu/drm/virtio/Makefile | 3 +-
> drivers/gpu/drm/virtio/virtgpu_drv.h | 33 ++++
> drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 +-
> drivers/gpu/drm/virtio/virtgpu_object.c | 6 +
> drivers/gpu/drm/virtio/virtgpu_userptr.c | 231 +++++++++++++++++++++++
> include/uapi/drm/virtgpu_drm.h | 9 +
> include/uapi/linux/virtio_gpu.h | 7 +
> 7 files changed, 295 insertions(+), 3 deletions(-)
> create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c
>