Message-ID: <7f73fdfa-2875-4349-9ef6-134e678ac691@invisiblethingslab.com>
Date: Tue, 4 Nov 2025 22:16:44 -0300
From: Val Packett <val@...isiblethingslab.com>
To: Jürgen Groß <jgross@...e.com>,
Stefano Stabellini <sstabellini@...nel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
Cc: xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] xen: privcmd: fix ioeventfd/ioreq crashing PV domain
On 11/4/25 9:15 AM, Jürgen Groß wrote:
> On 15.10.25 21:57, Val Packett wrote:
>> Starting a virtio backend in a PV domain would panic the kernel in
>> alloc_ioreq, trying to dereference vma->vm_private_data as a pages
>> pointer when in reality it stayed as PRIV_VMA_LOCKED.
>>
>> Fix by allocating a pages array in mmap_resource in the PV case,
>> filling it with page info converted from the pfn array. This allows
>> ioreq to function successfully with a backend provided by a PV dom0.
>>
>> Signed-off-by: Val Packett <val@...isiblethingslab.com>
>> ---
>> I've been porting the xen-vhost-frontend[1] to Qubes, which runs on amd64
>> and we (still) use PV for dom0. The x86 part didn't give me much trouble,
>> but the first thing I found was this crash due to using a PV domain to host
>> the backend. alloc_ioreq was dereferencing the '1' constant and panicking
>> the dom0 kernel.
>>
>> I figured out that I can make a pages array in the expected format from the
>> pfn array where the actual memory mapping happens for the PV case, and with
>> the fix, the ioreq part works: the vhost frontend replies to the probing
>> sequence and the guest recognizes which virtio device is being provided.
>>
>> I still have another thing to debug: the MMIO accesses from the inner driver
>> (e.g. virtio_rng) don't get through to the vhost provider (ioeventfd does
>> not get notified), and manually kicking the eventfd from the frontend
>> seems to crash... Xen itself?? (no Linux panic on console, just a freeze and
>> quick reboot - will try to set up a serial console now)
>
> IMHO for making the MMIO accesses work you'd need to implement ioreq-server
> support for PV-domains in the hypervisor. This will be a major endeavor, so
> before taking your Linux kernel patch I'd like to see this covered.

Sorry, I wasn't clear enough: it's *not* that MMIO accesses don't work.

I debugged this a bit more, and it turns out:

1. The reason "ioeventfd does not get notified" is that accessing the
virtio page (allocated with this privcmd interface) from the kernel was
failing. The exchange between the guest driver and the userspace ioreq
server has been working perfectly, but the *kernel* access (which is
what needs this `struct page` allocation with the current code) was
returning nonsense, so the check for the virtqueue readiness flag was
failing.

I have noticed and fixed (locally) a bug in this patch: reusing the
`pfns` allocation for `errs` in `xen_remap_domain_mfn_array` meant that
the actual pfn value was overwritten with a zero ("success" error code),
and that zeroed value was the `pfn` I was using.

Still, even with that fixed, the memory visible in the dom0 kernel at
that pfn is not the same allocation that's mapped into the process.
Instead, it's some random other memory. I've added a hexdump for it in
the ioeventfd notifier and it was returning random stuff from other
userspace programs such as "// SPDX-License-Identifier" from a text
editor (haha). Actually, *once* it did just work and I managed to attach
a virtio-rng driver and have it fully work.

Clearly I'm just struggling with the way memory mappings work under PV.
Do I need to specifically create a second mapping for the kernel using
the same `xen_remap_domain_mfn_array` call?

2. The reason "manually kicking the eventfd from the frontend seems to
crash... Xen itself" was that the kick triggered the guest interrupt,
and I was using the ISA interrupts, which require the virtual (IO)APIC
to exist, and it doesn't in PVH domains. For now I switched my test
setup to HVM to get around that, but I'd need to figure out a virq/pirq
type setup to route XEN_DMOP_set_isa_irq_level calls over event channels
for PV(H) guests.

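
To make the `pfns`/`errs` aliasing bug from point 1 concrete, here is a
minimal userspace sketch. It is not the privcmd code: `toy_pfn_t`,
`toy_remap_array` and the two `frame_after_*` helpers are hypothetical
names invented for illustration, and the only assumption carried over
from the real code is that the helper stores one `int` status per frame
(0 meaning success) into whatever status buffer it is given.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t toy_pfn_t; /* hypothetical stand-in for a 64-bit frame number */

/*
 * Hypothetical stand-in for a remap helper: reads nr frame numbers from
 * frame_buf and stores one int status per frame in err_buf (0 = success).
 * memcpy keeps the toy well-defined even when the two buffers overlap.
 */
static int toy_remap_array(const void *frame_buf, void *err_buf, int nr)
{
    assert(nr >= 0);
    for (int i = 0; i < nr; i++) {
        toy_pfn_t frame;
        int err = 0;

        memcpy(&frame, (const char *)frame_buf + (size_t)i * sizeof(frame),
               sizeof(frame));
        (void)frame; /* a real helper would map this frame here */
        memcpy((char *)err_buf + (size_t)i * sizeof(err), &err, sizeof(err));
    }
    return nr;
}

/* Buggy pattern: the status buffer aliases the frame buffer. */
static toy_pfn_t frame_after_aliased_call(toy_pfn_t frame)
{
    toy_pfn_t buf[1] = { frame };

    toy_remap_array(buf, buf, 1);
    return buf[0]; /* the "success" status has clobbered part of the frame */
}

/* Safer pattern: statuses go to their own storage, the frame survives. */
static toy_pfn_t frame_after_separate_call(toy_pfn_t frame)
{
    toy_pfn_t buf[1] = { frame };
    int errs[1] = { -1 };

    toy_remap_array(buf, errs, 1);
    return buf[0];
}
```

On a little-endian machine such as amd64, `frame_after_aliased_call(0x12345)`
reads back 0, because the 4-byte success status lands on the low half of
the 8-byte frame number, while `frame_after_separate_call(0x12345)`
returns the value unchanged. Copying the frame number out before the
call would work just as well as a separate status buffer.
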
>> But I figured I'd post this as an RFC already, since the other bug may be
>> unrelated and the ioreq area itself does work now. I'd like to hear some
>> feedback on this from people who actually know Xen :)
>
> My main problem with your patch is that it is adding a memory allocation
> for a very rare use case impacting all current users of that
> functionality.
>
> You could avoid that by using a different ioctl which could be selected by
> specifying a new flag when calling xenforeignmemory_open() (have a look
> into the Xen sources under tools/libs/foreignmemory/core.c).

Right, that could be solved. Having userspace choose based on what kind
of domain it is sounds a bit painful (you're talking about C libraries
and I'm using independent Rust ones, so this logic would have to be
present in multiple places). But this kernel code could be refactored
more.

We don't actually need a `struct page` specifically:
`ioeventfd_interrupt` only really needs a kernel pointer to the actual
ioreq memory we're allocating here.

I'm mostly just asking for help to figure out how to get that pointer.

Thanks,
~val