Message-ID: <20260115075851.173318-1-honglei1.huang@amd.com>
Date: Thu, 15 Jan 2026 15:58:46 +0800
From: Honglei Huang <honglei1.huang@....com>
To: David Airlie <airlied@...hat.com>, Gerd Hoffmann <kraxel@...hat.com>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, Simona Vetter <simona@...ll.ch>,
<Ray.Huang@....com>
CC: Gurchetan Singh <gurchetansingh@...omium.org>,
<odaki@....ci.i.u-tokyo.ac.jp>, Chia-I Wu <olvaffe@...il.com>,
<dri-devel@...ts.freedesktop.org>, <virtualization@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>, Honglei Huang <honghuan@....com>
Subject: [PATCH v4 0/5] virtio-gpu: Add userptr support for compute workloads
From: Honglei Huang <honghuan@....com>
Hello,
This series adds virtio-gpu userptr support to enable ROCm native
context for compute workloads. The userptr feature allows the host to
directly access guest userspace memory without memcpy overhead, which is
essential for GPU compute performance.
The userptr implementation provides buffer-based zero-copy memory access.
This approach pins guest userspace pages and exposes them to the host
via scatter-gather tables, enabling efficient compute operations.
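
As a rough illustration of the page arithmetic behind this approach (not code from the series), the following userspace C sketch counts the pages a userptr range touches and merges adjacent page frames into segments, much as a scatter-gather table groups contiguous pages. The 4 KiB page size and every `sketch_` name are assumptions made for this example; the actual driver works on `struct page` and kernel scatterlist helpers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical type: one run of adjacent page frames, loosely like an sg entry. */
struct sketch_segment {
    uint64_t first_pfn; /* first page frame number in the run */
    uint64_t npages;    /* number of adjacent frames in the run */
};

/* Number of pages touched by the byte range [addr, addr + size). */
static inline uint64_t sketch_page_count(uint64_t addr, uint64_t size)
{
    assert(size > 0);
    uint64_t first = addr >> SKETCH_PAGE_SHIFT;
    uint64_t last = (addr + size - 1) >> SKETCH_PAGE_SHIFT;
    return last - first + 1;
}

/*
 * Merge adjacent page frames into segments.
 * Returns the number of segments written, or 0 if `out` is too small.
 */
static inline size_t sketch_coalesce(const uint64_t *pfns, size_t n,
                                     struct sketch_segment *out, size_t max_out)
{
    size_t nseg = 0;

    for (size_t i = 0; i < n; i++) {
        /* Extend the current run when this frame directly follows it. */
        if (nseg > 0 &&
            out[nseg - 1].first_pfn + out[nseg - 1].npages == pfns[i]) {
            out[nseg - 1].npages++;
            continue;
        }
        if (nseg == max_out)
            return 0;
        out[nseg].first_pfn = pfns[i];
        out[nseg].npages = 1;
        nseg++;
    }
    return nseg;
}
```

An unaligned range can span one more page than its size suggests, which is why the count is derived from the first and last byte rather than from the size alone. Fewer, longer segments also mean fewer entries to describe to the host.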
Key features:
- Zero-copy memory access between guest userspace and host GPU
- Read-only and read-write userptr support
- Runtime feature detection via VIRTGPU_PARAM_RESOURCE_USERPTR
- ROCm capset support for ROCm stack integration
- Proper page lifecycle management with FOLL_LONGTERM pinning
Patches overview:
1. Add VIRTIO_GPU_CAPSET_ROCM capability for compute workloads
2. Add virtio-gpu API definitions for userptr blob resources
3. Extend DRM UAPI with comprehensive userptr support
4. Implement core userptr functionality with page management
5. Integrate userptr into blob resource creation and advertise to userspace
Performance: In popular compute benchmarks, this implementation achieves
approximately 70% of bare metal OpenCL performance on AMD V2000 hardware
and 92% on AMD W7900 hardware.
Testing: Verified with the ROCm stack and OpenCL applications in VIRTIO
virtualized environments.
- The full OpenCL CTS passed on ROCm 5.7.0 on the V2000 platform.
- Nearly 70% of OpenCL CTS tests passed on ROCm 7.0 on the W7900 platform.
- Most HIP catch tests passed on ROCm 7.0 on the W7900 platform.
- Some AI applications were enabled on ROCm 7.0 on the W7900 platform.
V4 changes:
- Renamed VIRTIO_GPU_CAPSET_HSAKMT to VIRTIO_GPU_CAPSET_ROCM
- Removed userptr feature probing because it can reuse the guest
blob resource code path, reducing the patch count from 6 to 5
- Updated corresponding commit messages
- Consolidated userptr feature detection in final patch
- Updated corresponding cover letter content
V3 changes:
- Split into focused patches for easier review
- Removed complex interval tree userptr management
- Simplified resource creation without deduplication
- Added VIRTGPU_PARAM_RESOURCE_USERPTR for feature detection
- Improved UAPI documentation and error handling
- Enhanced code quality with proper cleanup paths
- Removed MMU notifier dependencies for simplicity
- Fixed resource lifecycle management issues
V2 changes:
- Split the HSAKMT context and blob userptr resource additions into two
patches.
- Removed the MMU notifier related patches, because using non-movable
user space memory with an MMU notifier is not a good idea.
- Removed the HSAKMT context check at context creation so that all
contexts support the userptr feature.
- Removed MMU notifier related content from the cover letter.
- Added more comments for patch 6 in the cover letter.
Honglei Huang (5):
drm/virtio-gpu: Add VIRTIO_GPU_CAPSET_ROCM capability
virtio-gpu api: add blob userptr resource
drm/virtgpu api: add blob userptr resource
drm/virtio: implement userptr support for zero-copy memory access
drm/virtio: advertise base userptr feature to userspace
drivers/gpu/drm/virtio/Makefile | 3 +-
drivers/gpu/drm/virtio/virtgpu_drv.h | 33 ++++
drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 +-
drivers/gpu/drm/virtio/virtgpu_object.c | 6 +
drivers/gpu/drm/virtio/virtgpu_userptr.c | 231 +++++++++++++++++++++++
include/uapi/drm/virtgpu_drm.h | 9 +
include/uapi/linux/virtio_gpu.h | 7 +
7 files changed, 295 insertions(+), 3 deletions(-)
create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c
--
2.34.1