Message-ID: <5c651be7-eac5-4c9e-a209-6db3a06c3d2e@gmail.com>
Date: Tue, 4 Nov 2025 18:05:04 -0500
From: Demi Marie Obenour <demiobenour@...il.com>
To: Jürgen Groß <jgross@...e.com>,
Val Packett <val@...isiblethingslab.com>,
Stefano Stabellini <sstabellini@...nel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
Cc: xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
Qubes Developer Mailing List <qubes-devel@...glegroups.com>
Subject: Re: [RFC PATCH] xen: privcmd: fix ioeventfd/ioreq crashing PV domain

On 11/4/25 07:15, Jürgen Groß wrote:
> On 15.10.25 21:57, Val Packett wrote:
>> Starting a virtio backend in a PV domain would panic the kernel in
>> alloc_ioreq, trying to dereference vma->vm_private_data as a pages
>> pointer when in reality it stayed as PRIV_VMA_LOCKED.
>>
>> Fix by allocating a pages array in mmap_resource in the PV case,
>> filling it with page info converted from the pfn array. This allows
>> ioreq to function successfully with a backend provided by a PV dom0.
>>
>> Signed-off-by: Val Packett <val@...isiblethingslab.com>
>> ---
>> I've been porting the xen-vhost-frontend[1] to Qubes, which runs on amd64
>> and we (still) use PV for dom0. The x86 part didn't give me much trouble,
>> but the first thing I found was this crash due to using a PV domain to host
>> the backend. alloc_ioreq was dereferencing the '1' constant and panicking
>> the dom0 kernel.
>>
>> I figured out that I can make a pages array in the expected format from the
>> pfn array where the actual memory mapping happens for the PV case, and with
>> the fix, the ioreq part works: the vhost frontend replies to the probing
>> sequence and the guest recognizes which virtio device is being provided.
>>
>> I still have another thing to debug: the MMIO accesses from the inner driver
>> (e.g. virtio_rng) don't get through to the vhost provider (ioeventfd does
>> not get notified), and manually kicking the eventfd from the frontend
>> seems to crash... Xen itself?? (no Linux panic on console, just a freeze and
>> quick reboot - will try to set up a serial console now)
>
> IMHO for making the MMIO accesses work you'd need to implement ioreq-server
> support for PV-domains in the hypervisor. This will be a major endeavor, so
> before taking your Linux kernel patch I'd like to see this covered.

Would implementing ioreq-server support for PV domains be a good use
of time, or would it be better to focus on switching to PVH dom0?
I don't know if it makes sense to spend effort on PV dom0 when dom0
isn't going to stay PV indefinitely.

Edera might well be interested in the PV case, as they run in cloud
VMs without nested virtualization. That's not relevant to Qubes OS,
though.

>> But I figured I'd post this as an RFC already, since the other bug may be
>> unrelated and the ioreq area itself does work now. I'd like to hear some
>> feedback on this from people who actually know Xen :)
>
> My main problem with your patch is that it is adding a memory allocation
> for a very rare use case impacting all current users of that functionality.
>
> You could avoid that by using a different ioctl which could be selected by
> specifying a new flag when calling xenforeignmemory_open() (have a look
> into the Xen sources under tools/libs/foreignmemory/core.c).

Should there at least be a check to prevent the kernel from crashing?
I'd expect an unsupported use of the API to return an error, not
cause the kernel to oops.
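
To illustrate the kind of check meant here, a rough userspace sketch
follows. It is not the privcmd code: struct vma_model, struct page as
defined here, and the helper ioreq_pages_from_vma() are hypothetical
stand-ins, the choice of -EOPNOTSUPP is an assumption, and the sentinel
value is taken from the description above (alloc_ioreq dereferencing
the '1' constant left in vm_private_data).

```c
#include <errno.h>
#include <stddef.h>

/* Userspace model only: these types stand in for the kernel's. */
struct page { int dummy; };
struct vma_model { void *vm_private_data; };

/* Sentinel; the thread describes it as the constant '1'. */
#define PRIV_VMA_LOCKED ((void *)1)

/*
 * Hypothetical helper: hand back the pages array through *out, or fail
 * with an error if the VMA only carries the sentinel (the PV case),
 * instead of treating the sentinel as a pointer and dereferencing it.
 */
static int ioreq_pages_from_vma(const struct vma_model *vma,
                                struct page ***out)
{
    void *priv = vma->vm_private_data;

    if (priv == NULL || priv == PRIV_VMA_LOCKED)
        return -EOPNOTSUPP; /* assumed error code for "not supported" */

    *out = (struct page **)priv;
    return 0;
}
```

A caller would then propagate the error to userspace rather than
continuing with an invalid pages pointer.
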
--
Sincerely,
Demi Marie Obenour (she/her/hers)