Message-Id: <20240826-iopf-for-all-v1-0-59174e6a7528@samsung.com>
Date: Mon, 26 Aug 2024 13:40:26 +0200
From: Klaus Jensen <its@...elevant.dk>
To: David Woodhouse <dwmw2@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Jason Gunthorpe <jgg@...pe.ca>, Kevin Tian <kevin.tian@...el.com>
Cc: Minwoo Im <minwoo.im@...sung.com>, linux-kernel@...r.kernel.org,
iommu@...ts.linux.dev, Joel Granados <j.granados@...sung.com>,
Klaus Jensen <k.jensen@...sung.com>
Subject: [PATCH RFC PREVIEW 0/6] iommu: enable user space iopfs in
non-nested and non-svm cases
This is a Request for Comments series that will hopefully generate
initial feedback on using iommufd_hwpt_replace_device to handle
user space IOPFs in non-nested and non-svm cases. Our main motivation is
to enable user-space-driver-driven device verification with the default
pasid and without nesting or SVM.
What?
* Enable IO page fault handling in user space in a non-nested, non-svm
and non-virtualised use case.
* Remove the relation between IOPF and INTEL_IOMMU_SVM by allowing
the user to (de)select the IOPF code through Kconfig.
* Create a new file under iommu/intel (prq.c) that contains all the
page request queue related logic and is not under intel/svm.c.
* Add the IOMMU_HWPT_FAULT_ID_VALID to the valid flags used to create
IOMMU_HWPT_ALLOC allocations.
* Create a default (zero) pasid handle and insert it into the pasid
array within the dev->iommu_group when replacing the old HWPT with
an iopf enabled HWPT.
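The flag change in the fourth item above can be pictured as widening a mask of accepted bits. What follows is a minimal sketch of that pattern, not the kernel code: the DEMO_* constants and demo_check_alloc_flags() are hypothetical stand-ins, and the bit values are illustrative only.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hwpt allocation flag bits (illustrative values). */
#define DEMO_HWPT_ALLOC_NEST_PARENT    (1u << 0)
#define DEMO_HWPT_ALLOC_DIRTY_TRACKING (1u << 1)
#define DEMO_HWPT_FAULT_ID_VALID       (1u << 2)

/*
 * Reject any flag bit that is not in the accepted mask.
 * allow_fault_id models whether the fault-id flag has been added to the mask.
 * Returns 0 if all bits are accepted, -EOPNOTSUPP otherwise.
 */
static int demo_check_alloc_flags(uint32_t flags, int allow_fault_id)
{
	uint32_t valid = DEMO_HWPT_ALLOC_NEST_PARENT |
			 DEMO_HWPT_ALLOC_DIRTY_TRACKING;

	if (allow_fault_id)
		valid |= DEMO_HWPT_FAULT_ID_VALID;

	return (flags & ~valid) ? -EOPNOTSUPP : 0;
}
```

With allow_fault_id set to 0 the sketch refuses a request carrying the fault-id flag; with it set to 1 the same request passes, while unknown bits are still refused.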
Why?
The PCI ATS Extended Capability allows peripheral devices to
participate in the caching of translations when operating under an
IOMMU. Further, the ATS Page Request Interface (PRI) Extension allows
devices to handle missing mappings. Currently, PRI is mainly used in
the context of Shared Virtual Addressing, requiring support for the
Process Address Space Identifier (PASID) capability, but other use
cases exist, such as enabling user-space driver driven device
verification and reducing memory pinning. This patchset sets out to
enable these use cases.
Testing?
The non-nested/non-svm IOPF interface is exercised by first
initializing an iopf enabled ioas and then reading the fault file
descriptor. Pseudocode for iopf initialization and handling is in [3]
and [4] (using libvfn).
Supplementary repositories supporting this patchset:
1. A user space library libvfn [1] which is used for testing and
verification (see examples/iopf.c), and
2. Basic emulation of PCIe ATS/PRI and Intel VT-d PRQ in QEMU [2].
Notes
Patches 5 and 6 were added by Klaus for testing against the QEMU test
device (which does not support PASID). They are very much RFC.
Comments and feedback are greatly appreciated.
Best
Joel
PS: I'm on PTO, so my answers might be delayed (back September 2nd), but
I'll prioritize answering any questions or feedback when I see
them.
[1] https://github.com/SamsungDS/libvfn/tree/iommufd-fault-queue
[2] https://gitlab.com/birkelund/qemu/-/tree/pcie-ats-pri
[3] Initializing
```
int iopf_init(struct iommu_ioas *ioas, const char *bdf)
{
	// open vfio device from bdf
	int devfd = open("/dev/vfio/devices/VFIO_DEV", O_RDWR);

	// bind the device to the iommufd context
	struct vfio_device_bind_iommufd bind = {
		.argsz = sizeof(bind),
		.flags = 0,
		.iommufd = __iommufd,
	};
	ioctl(devfd, VFIO_DEVICE_BIND_IOMMUFD, &bind);

	// attach the device to the ioas passed in by the caller
	struct vfio_device_attach_iommufd_pt attach_data = {
		.argsz = sizeof(attach_data),
		.flags = 0,
		.pt_id = ioas->id,
	};
	ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);

	// allocate the fault queue
	struct iommu_fault_alloc fault = {
		.size = sizeof(fault),
		.flags = 0,
	};
	ioctl(__iommufd, IOMMU_FAULT_QUEUE_ALLOC, &fault);

	// allocate an iopf enabled HWPT tied to the fault queue
	struct iommu_hwpt_alloc fault_cmd = {
		.size = sizeof(fault_cmd),
		.flags = IOMMU_HWPT_FAULT_ID_VALID,
		.dev_id = bind.out_devid,
		.pt_id = ioas->id,
		.data_len = 0,
		.data_uptr = (uint64_t)NULL,
		.fault_id = fault.out_fault_id,
		.__reserved = 0,
	};
	ioctl(__iommufd, IOMMU_HWPT_ALLOC, &fault_cmd);

	// This is a re-attach
	struct vfio_device_attach_iommufd_pt attach = {
		.argsz = sizeof(attach),
		.flags = 0,
		.pt_id = fault_cmd.out_hwpt_id
	};
	ioctl(devfd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach);

	return 0;
}
```
[4] Handling
```
int handle_iopf(void *vaddr, int len, uint64_t iova) {
	exec_command(CMD);

	// the fault queue fd comes from the IOMMU_FAULT_QUEUE_ALLOC call in [3]
	int iopf_fd = fault.out_fault_fd;

	struct iommu_hwpt_pgfault pgfault = {0};
	if (read(iopf_fd, &pgfault, sizeof(pgfault)) == 0)
		return 0; // no page fault

	// assumes iommu_map_vaddr() returns 0 on success
	int ret = iommu_map_vaddr(__iommufd, vaddr, len, &iova);

	struct iommu_hwpt_page_response pgfault_response = {
		.cookie = pgfault.cookie,
		.code = ret == 0 ? IOMMUFD_PAGE_RESP_SUCCESS : IOMMUFD_PAGE_RESP_INVALID,
	};
	write(iopf_fd, &pgfault_response, sizeof(pgfault_response));
	return ret;
}
```
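The handler in [4] reads the fault descriptor directly. A test harness may prefer to wait with a timeout first, so a command that never faults does not block it. The following is a minimal sketch of that pattern, assuming the descriptor can be waited on with poll(2). The names demo_fault, demo_fault_cb, demo_count_cb and demo_wait_one_fault are hypothetical, and struct demo_fault is a simplified stand-in, not the layout of struct iommu_hwpt_pgfault.

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical, simplified fault record; NOT the real uapi layout. */
struct demo_fault {
	uint64_t addr;
	uint32_t length;
	uint32_t cookie;
};

/* Hypothetical callback type: returns 0 if the fault was resolved. */
typedef int (*demo_fault_cb)(const struct demo_fault *f, void *opaque);

/* Example callback: counts the faults it sees; opaque points to an int. */
static int demo_count_cb(const struct demo_fault *f, void *opaque)
{
	(void)f;
	(*(int *)opaque)++;
	return 0;
}

/*
 * Wait up to timeout_ms for fd to become readable, read one record and
 * hand it to cb. Returns 1 if a fault was handled, 0 on timeout and
 * -1 on error (poll failure, short read, or cb failure).
 */
static int demo_wait_one_fault(int fd, int timeout_ms,
			       demo_fault_cb cb, void *opaque)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	struct demo_fault f;
	ssize_t n;
	int ret;

	/* Retry if a signal interrupts the wait. */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);
	if (ret <= 0)
		return ret;

	n = read(fd, &f, sizeof(f));
	if (n != (ssize_t)sizeof(f))
		return -1;

	return cb(&f, opaque) == 0 ? 1 : -1;
}
```

In a real test the callback would map the faulting range and write the page response, as [4] does. Here it only counts, so the sketch can be exercised with a pipe in place of the fault descriptor.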
Signed-off-by: Joel Granados <j.granados@...sung.com>
Signed-off-by: Klaus Jensen <k.jensen@...sung.com>
---
Joel Granados (4):
iommu/vt-d: Separate page request queue from SVM
iommu: Make IOMMU_IOPF selectable in Kconfig
iommufd: Enable PRI when doing the iommufd_hwpt_alloc
iommu: init pasid array while doing domain_replace and iopf is active
Klaus Jensen (2):
iommu/vt-d: drop pasid requirement for prq initialization
iommu/vt-d: do not require a PASID in page requests
drivers/iommu/Kconfig | 2 +-
drivers/iommu/intel/Kconfig | 1 -
drivers/iommu/intel/Makefile | 2 +-
drivers/iommu/intel/iommu.c | 29 ++--
drivers/iommu/intel/iommu.h | 40 ++++-
drivers/iommu/intel/prq.c | 284 ++++++++++++++++++++++++++++++++
drivers/iommu/intel/svm.c | 308 -----------------------------------
drivers/iommu/iommu-priv.h | 3 +
drivers/iommu/iommu.c | 31 ++++
drivers/iommu/iommufd/fault.c | 22 +++
drivers/iommu/iommufd/hw_pagetable.c | 3 +-
11 files changed, 389 insertions(+), 336 deletions(-)
---
base-commit: 3d5f968a177d468cd13568ef901c5be84d83d32b
change-id: 20240823-iopf-for-all-3b19075efc32
Best regards,
--
Klaus Jensen <k.jensen@...sung.com>