Message-ID: <20251126062124.117425-1-val@invisiblethingslab.com>
Date: Wed, 26 Nov 2025 02:54:57 -0300
From: Val Packett <val@...isiblethingslab.com>
To: Juergen Gross <jgross@...e.com>,
	Stefano Stabellini <sstabellini@...nel.org>,
	Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
Cc: Val Packett <val@...isiblethingslab.com>,
	xen-devel@...ts.xenproject.org,
	linux-kernel@...r.kernel.org
Subject: [RFC PATCH v2] xen: privcmd: fix ioeventfd crash under PV domain

Starting a virtio backend in a PV domain would panic the kernel in
alloc_ioreq(), which dereferences vma->vm_private_data as a pages
pointer when it actually still holds PRIV_VMA_LOCKED.

Avoid crashing by handling the PRIV_VMA_LOCKED case in alloc_ioreq.
PV support requires mapping the virtio ioreq page explicitly into
the kernel's page tables, so do it on-demand when PRIV_VMA_LOCKED
is seen.

Signed-off-by: Val Packett <val@...isiblethingslab.com>
---
Changes from RFC v1[1]:
* An actually working patch now, not just a request for help :)
  All I needed to know was that PV doesn't emulate a common physical
  address space *at all*: the pfn mapped by xen_remap_domain_mfn_array
  was *only* available in the userspace process, and the *only* way to
  get kernel access to the same memory was to perform the same call for
  the kernel's mm.
* Everything happens lazily / on-demand, inside of the ioeventfd code,
  addressing the review concerns about extra work performed for
  non-ioeventfd usage. (The kmalloc specifically is gone entirely.)

I'm leaving this as RFC mostly because of the "fake" vm_area_struct that's used
as a workaround for xen_remap_domain_mfn_array (or rather, its underlying
xen_remap_pfn) accepting a vm_area_struct only to take its vm_mm (and to check
its flags with a BUG assertion). It was written for userspace processes, since
that was the only use case anyone could ever imagine... until ioeventfd came along.

So it would probably be better to change xen_remap_pfn to take the mm? Maybe?
I wanted to ask for advice first before trying to refactor other code.
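
For illustration, here is a rough sketch of the shape I had in mind; the
helper name xen_remap_pfn_mm is made up, and the parameter list is only
meant to mirror the current xen_remap_pfn prototype approximately, not to
match it exactly:

	/* Core helper takes the mm directly (hypothetical refactor) */
	int xen_remap_pfn_mm(struct mm_struct *mm, unsigned long addr,
			     xen_pfn_t *pfn, int nr, int *err_ptr,
			     pgprot_t prot, unsigned int domid,
			     bool no_translate)
	{
		/* ...current xen_remap_pfn body, using 'mm' instead of vma->vm_mm... */
	}

	/* Existing vma-based entry point becomes a thin wrapper */
	int xen_remap_pfn(struct vm_area_struct *vma, unsigned long addr,
			  xen_pfn_t *pfn, int nr, int *err_ptr, pgprot_t prot,
			  unsigned int domid, bool no_translate)
	{
		/* keep the flags sanity check on the vma-based path only */
		BUG_ON((vma->vm_flags & (VM_PFNMAP | VM_IO)) != (VM_PFNMAP | VM_IO));

		return xen_remap_pfn_mm(vma->vm_mm, addr, pfn, nr, err_ptr,
					prot, domid, no_translate);
	}

With something like that, the ioeventfd code could pass &init_mm directly
and the on-stack vm_area_struct below would go away.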

Thanks,
~val

[1]: https://lore.kernel.org/all/20251015195713.6500-1-val@invisiblethingslab.com/
---
 drivers/xen/privcmd.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index f52a457b302d..a3ad10f149ec 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -818,6 +818,8 @@ static long privcmd_ioctl_mmap_resource(struct file *file,
 			DOMID_SELF : kdata.dom;
 		int num, *errs = (int *)pfns;
 
+		vma->vm_pgoff = pfns[0]; /* store the acquired pfn for ioeventfd access */
+
 		BUILD_BUG_ON(sizeof(*errs) > sizeof(*pfns));
 		num = xen_remap_domain_mfn_array(vma,
 						 kdata.addr & PAGE_MASK,
@@ -1248,10 +1250,38 @@ struct privcmd_kernel_ioreq *alloc_ioreq(struct privcmd_ioeventfd *ioeventfd)
 		goto error_kfree;
 	}
 
-	pages = vma->vm_private_data;
-	kioreq->ioreq = (struct ioreq *)(page_to_virt(pages[0]));
 	mmap_write_unlock(mm);
 
+	/* In a PV domain, we must manually map the pages into the kernel */
+	if (vma->vm_private_data == PRIV_VMA_LOCKED) {
+		/* This should never ever happen outside of PV */
+		if (WARN_ON_ONCE(!xen_pv_domain())) {
+			ret = -EINVAL;
+			goto error_kfree;
+		}
+
+		/* xen_remap_domain_mfn_array only really needs the mm */
+		struct vm_area_struct kern_vma = {
+			.vm_flags = VM_PFNMAP | VM_IO,
+			.vm_mm = &init_mm,
+		};
+		xen_pfn_t pfn = vma->vm_pgoff;
+		int num, err;
+
+		/* Don't provide NULL as the errors array as that results in pfn increment */
+		num = xen_remap_domain_mfn_array(&kern_vma, (unsigned long)pfn_to_kaddr(pfn),
+						&pfn, 1, &err, PAGE_KERNEL, ioeventfd->dom);
+		if (num < 0) {
+			ret = num;
+			goto error_kfree;
+		}
+
+		kioreq->ioreq = (struct ioreq *)(pfn_to_kaddr(pfn));
+	} else {
+		pages = vma->vm_private_data;
+		kioreq->ioreq = (struct ioreq *)(page_to_virt(pages[0]));
+	}
+
 	ports = memdup_array_user(u64_to_user_ptr(ioeventfd->ports),
 				  kioreq->vcpus, sizeof(*ports));
 	if (IS_ERR(ports)) {
-- 
2.51.0

