Message-ID:
<SN6PR02MB4157342641D173ABE9B4F1FED485A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 8 Jan 2026 18:45:52 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Yu Zhang <zhangyu1@...ux.microsoft.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>
CC: "kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
<haiyangz@...rosoft.com>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>, "lpieralisi@...nel.org"
<lpieralisi@...nel.org>, "kwilczynski@...nel.org" <kwilczynski@...nel.org>,
"mani@...nel.org" <mani@...nel.org>, "robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "arnd@...db.de" <arnd@...db.de>,
"joro@...tes.org" <joro@...tes.org>, "will@...nel.org" <will@...nel.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"easwar.hariharan@...ux.microsoft.com"
<easwar.hariharan@...ux.microsoft.com>, "jacob.pan@...ux.microsoft.com"
<jacob.pan@...ux.microsoft.com>, "nunodasneves@...ux.microsoft.com"
<nunodasneves@...ux.microsoft.com>, "mrathor@...ux.microsoft.com"
<mrathor@...ux.microsoft.com>, "peterz@...radead.org" <peterz@...radead.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: RE: [RFC v1 0/5] Hyper-V: Add para-virtualized IOMMU support for
Linux guests
From: Yu Zhang <zhangyu1@...ux.microsoft.com> Sent: Monday, December 8, 2025 9:11 PM
>
> This patch series introduces a para-virtualized IOMMU driver for
> Linux guests running on Microsoft Hyper-V. The primary objective
> is to enable hardware-assisted DMA isolation and scalable device

Is there any particular meaning for the qualifier "scalable" vs. just
"device assignment"? I just want to understand what you are getting
at.

> assignment for Hyper-V child partitions, bypassing the performance
> overhead and complexity associated with emulated IOMMU hardware.
>
> The driver implements the following core functionality:
> * Hypercall-based Enumeration
> Unlike traditional ACPI-based discovery (e.g., DMAR/IVRS),
> this driver enumerates the Hyper-V IOMMU capabilities directly
> via hypercalls. This approach allows the guest to discover
> IOMMU presence and features without requiring specific virtual
> firmware extensions or modifications.
>
> * Domain Management
> The driver manages IOMMU domains through a new set of Hyper-V
> hypercall interfaces, handling domain allocation, attachment,
> and detachment for endpoint devices.
>
> * IOTLB Invalidation
> IOTLB invalidation requests are marshaled and issued to the
> hypervisor through the same hypercall mechanism.
>
> * Nested Translation Support
> This implementation leverages guest-managed stage-1 I/O page
> tables nested with host stage-2 translations. It is built
> upon the consolidated IOMMU page table framework designed by
> Jason Gunthorpe [1]. This design eliminates the need for complex
> emulation during map operations and ensures scalability across
> different architectures.
>
> Implementation Notes:
> * Architecture Independence
> While the current implementation only supports x86 platforms (Intel
> VT-d and AMD IOMMU), the driver design aims to be as architecture-
> agnostic as possible. To achieve this, initialization occurs via
> `device_initcall` rather than `x86_init.iommu.iommu_init`, and shutdown
> is handled via `syscore_ops` instead of `x86_platform.iommu_shutdown`.
>
> * MSI Region Handling
> In this RFC, the hardware MSI region is hard-coded to the standard
> x86 interrupt range (0xfee00000 - 0xfeefffff). Future updates may
> allow this configuration to be queried via hypercalls if new hardware
> platforms are to be supported.
>
> * Reserved Regions (RMRR)
> There is currently no requirement to support assigned devices with
> ACPI RMRR limitations. Consequently, this patch series does not specify
> or query reserved memory regions.
>
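
To make the MSI Region Handling note concrete, here is a minimal
standalone sketch of an overlap check against the hard-coded window
(0xfee00000 - 0xfeefffff). It is not code from the series: the macro
and function names are hypothetical, and only the two addresses come
from the cover letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Addresses are from the cover letter; the names are hypothetical. */
#define HV_MSI_IOVA_START 0xfee00000ULL
#define HV_MSI_IOVA_LAST  0xfeefffffULL	/* inclusive */

/* Return true if [iova, iova + len) overlaps the MSI window. */
static bool iova_overlaps_msi(uint64_t iova, uint64_t len)
{
	uint64_t last;

	/* An empty range overlaps nothing. */
	if (len == 0)
		return false;

	last = iova + len - 1;
	/* If the range wrapped past 2^64, clamp it to the top of the space. */
	if (last < iova)
		last = UINT64_MAX;

	return iova <= HV_MSI_IOVA_LAST && last >= HV_MSI_IOVA_START;
}
```

An allocator could use a check like this to keep DMA mappings out of
the window. If the range later becomes queryable via hypercall, only
the two constants would need to turn into variables.
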
> Testing:
> This series has been validated using dmatest with Intel DSA devices
> assigned to the child partition. The tests confirmed successful DMA
> transactions under the para-virtualized IOMMU.
>
> Future Work:
> * Page-selective IOTLB Invalidation
> The current implementation relies on full-domain flushes. Support
> for page-selective invalidation is planned for a future series.
>
> * Advanced Features
> Support for vSVA and virtual PRI will be addressed in subsequent
> updates.
>
> * Root Partition Co-existence
> Ensure compatibility with the distinct para-virtualized IOMMU driver
> used by Hyper-V's Linux root partition, in which DMA remapping is
> not done through stage-1 I/O page tables and a separate set of IOMMU
> ops is provided.
>
> [1] https://github.com/jgunthorpe/linux/tree/iommu_pt_all
>
> Easwar Hariharan (2):
> PCI: hv: Create and export hv_build_logical_dev_id()
> iommu: Move Hyper-V IOMMU driver to its own subdirectory
>
> Wei Liu (1):
> hyperv: Introduce new hypercall interfaces used by Hyper-V guest IOMMU
>
> Yu Zhang (2):
> hyperv: allow hypercall output pages to be allocated for child
> partitions
> iommu/hyperv: Add para-virtualized IOMMU support for Hyper-V guest
>
> drivers/hv/hv_common.c | 21 +-
> drivers/iommu/Kconfig | 10 +-
> drivers/iommu/Makefile | 2 +-
> drivers/iommu/hyperv/Kconfig | 24 +
> drivers/iommu/hyperv/Makefile | 3 +
> drivers/iommu/hyperv/iommu.c | 608 ++++++++++++++++++
> drivers/iommu/hyperv/iommu.h | 53 ++
> .../irq_remapping.c} | 2 +-
> drivers/pci/controller/pci-hyperv.c | 28 +-
> include/asm-generic/mshyperv.h | 2 +
> include/hyperv/hvgdk_mini.h | 8 +
> include/hyperv/hvhdk_mini.h | 123 ++++
> 12 files changed, 850 insertions(+), 34 deletions(-)
> create mode 100644 drivers/iommu/hyperv/Kconfig
> create mode 100644 drivers/iommu/hyperv/Makefile
> create mode 100644 drivers/iommu/hyperv/iommu.c
> create mode 100644 drivers/iommu/hyperv/iommu.h
> rename drivers/iommu/{hyperv-iommu.c => hyperv/irq_remapping.c} (99%)
>
> --
> 2.49.0