Message-ID: <20251112074548.3718563-1-honglei1.huang@amd.com>
Date: Wed, 12 Nov 2025 15:45:42 +0800
From: Honglei Huang <honglei1.huang@....com>
To: David Airlie <airlied@...hat.com>, Gerd Hoffmann <kraxel@...hat.com>,
Dmitry Osipenko <dmitry.osipenko@...labora.com>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>, Simona Vetter <simona@...ll.ch>,
<Ray.Huang@....com>
CC: Gurchetan Singh <gurchetansingh@...omium.org>, Chia-I Wu
<olvaffe@...il.com>, <dri-devel@...ts.freedesktop.org>,
<virtualization@...ts.linux.dev>, <linux-kernel@...r.kernel.org>, "Honglei
Huang" <Honglei1.Huang@....com>
From: Honglei Huang <Honglei1.Huang@....com>
Subject: [PATCH v3 0/6] virtio-gpu: Add userptr support for compute workloads
Hello,
This series adds virtio-gpu userptr support to enable ROCm/OpenCL native
context for compute workloads. The userptr feature allows the host to
directly access guest userspace memory without memcpy overhead, which is
essential for GPU compute performance.
The userptr implementation provides buffer-based zero-copy memory access
rather than SVM (Shared Virtual Memory). This approach pins guest userspace
pages and exposes them to the host via scatter-gather tables, enabling
efficient compute operations while maintaining memory safety.
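As a rough illustration of the pinning step described above: before any
pages can be pinned, the (address, length) pair supplied by userspace has to
be turned into a page-aligned start and a page count. The sketch below shows
only that arithmetic in plain userspace C. The names userptr_page_span and
struct page_span are hypothetical and are not taken from this series; the
driver code may compute this differently.

```c
#include <stdint.h>

/* Hypothetical helper types for illustration only, not from the series. */
struct page_span {
	uint64_t first_page_addr; /* buffer address rounded down to a page */
	uint64_t offset;          /* byte offset of the buffer in that page */
	uint64_t npages;          /* number of pages the buffer touches */
};

/*
 * Compute which pages a user buffer [addr, addr + size) touches.
 * Returns 0 on success, -1 on invalid input (zero size, page size that
 * is not a power of two, or address range overflow).
 */
int userptr_page_span(uint64_t addr, uint64_t size, uint64_t page_size,
		      struct page_span *out)
{
	uint64_t mask, first, last;

	if (size == 0 || page_size == 0 || (page_size & (page_size - 1)) != 0)
		return -1;
	if (addr + size < addr) /* range wraps around the address space */
		return -1;

	mask = page_size - 1;
	first = addr & ~mask;             /* page holding the first byte */
	last = (addr + size - 1) & ~mask; /* page holding the last byte */

	out->first_page_addr = first;
	out->offset = addr & mask;
	out->npages = (last - first) / page_size + 1;
	return 0;
}
```

A buffer that is not page aligned can touch one more page than its size
alone suggests, which is why the count is derived from the first and last
byte rather than from size / page_size.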
Key features:
- Zero-copy memory access between guest userspace and host GPU
- Read-only and read-write userptr support
- Runtime feature detection via VIRTGPU_PARAM_RESOURCE_USERPTR
- HSAKMT context support for ROCm/HSA stack integration
- Proper page lifecycle management with FOLL_LONGTERM pinning
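For the runtime feature detection item, userspace would presumably query the
new parameter through the existing DRM_IOCTL_VIRTGPU_GETPARAM ioctl. A
minimal sketch, assuming the kernel UAPI header <drm/virtgpu_drm.h> is
installed. virtgpu_query_param is a hypothetical helper name, and the
parameter ID is passed in by the caller because the numeric value of
VIRTGPU_PARAM_RESOURCE_USERPTR comes from the patched UAPI header, not from
this letter.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/*
 * Hypothetical helper: query one virtio-gpu parameter on an open DRM fd.
 * Returns 0 and stores the result in *value on success, or a negative
 * errno value if the ioctl fails (for example when the kernel does not
 * know the parameter).
 */
int virtgpu_query_param(int fd, uint64_t param, int *value)
{
	struct drm_virtgpu_getparam gp;
	int result = 0;

	memset(&gp, 0, sizeof(gp));
	gp.param = param;
	/* The kernel writes an int through this user pointer. */
	gp.value = (uint64_t)(uintptr_t)&result;

	if (ioctl(fd, DRM_IOCTL_VIRTGPU_GETPARAM, &gp) < 0)
		return -errno;

	*value = result;
	return 0;
}
```

With headers from this series, a caller would pass
VIRTGPU_PARAM_RESOURCE_USERPTR as param and treat either an error or a zero
value as "userptr not available", falling back to a copy-based path.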
Patches overview:
1. Add HSAKMT context capset for compute workloads
2. Add virtio-gpu API definitions for userptr blob resources
3. Extend DRM UAPI with comprehensive userptr support
4. Add feature probing and capability advertisement
5. Implement core userptr functionality with page management
6. Integrate userptr into blob resource creation path
V3 improvements over V2:
- Simplified implementation by removing interval tree management
- Better patch organization with clear functional separation
- Improved UAPI documentation with detailed field descriptions
- Enhanced error handling and resource cleanup
- Removed complex resource deduplication logic for maintainability
- Added runtime feature detection parameter
- Fixed memory management issues and improved code quality
Performance: In popular compute benchmarks, this implementation reaches
approximately 70% of bare-metal OpenCL performance on AMD V2000 hardware
and approximately 92% on AMD W7900 hardware.
Testing: Verified with the ROCm stack and OpenCL applications in virtio
virtualized environments.
- The full OpenCL CTS passed with ROCm 5.7.0 on the V2000 platform.
- Nearly 70% of the OpenCL CTS tests passed with ROCm 7.0 on the W7900 platform.
- 50% of the HIP catch tests passed with ROCm 7.0 on the W7900 platform.
- Some AI applications were enabled with ROCm 7.0 on the W7900 platform.
V3 changes:
- Split into focused patches for easier review
- Removed complex interval tree userptr management
- Simplified resource creation without deduplication
- Added VIRTGPU_PARAM_RESOURCE_USERPTR for feature detection
- Improved UAPI documentation and error handling
- Enhanced code quality with proper cleanup paths
- Removed MMU notifier dependencies for simplicity
- Fixed resource lifecycle management issues
Honglei Huang (6):
  virtio-gpu api: add HSAKMT context
  virtio-gpu api: add blob userptr resource
  drm/virtgpu api: add blob userptr resource
  drm/virtio: implement userptr: probe for the feature
  drm/virtio: implement userptr support for zero-copy memory access
  drm/virtio: advertise base userptr feature to userspace

 drivers/gpu/drm/virtio/Makefile          |   3 +-
 drivers/gpu/drm/virtio/virtgpu_debugfs.c |   1 +
 drivers/gpu/drm/virtio/virtgpu_drv.c     |   1 +
 drivers/gpu/drm/virtio/virtgpu_drv.h     |  34 ++++
 drivers/gpu/drm/virtio/virtgpu_ioctl.c   |  14 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c     |   8 +-
 drivers/gpu/drm/virtio/virtgpu_object.c  |   7 +-
 drivers/gpu/drm/virtio/virtgpu_userptr.c | 231 +++++++++++++++++++++++
 include/uapi/drm/virtgpu_drm.h           |  10 +
 include/uapi/linux/virtio_gpu.h          |   7 +
 10 files changed, 310 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c
--
2.34.1