Message-ID: <CAHP4M8Ud_tm+SPmZtnSi1--zf=MTsbvSqDYdAfPdAXUj+Ormkg@mail.gmail.com>
Date: Sat, 20 Dec 2025 18:22:49 +0530
From: Ajay Garg <ajaygargnsit@...il.com>
To: Alex Williamson <alex@...zbot.org>
Cc: QEMU Developers <qemu-devel@...gnu.org>, iommu@...ts.linux-foundation.org,
linux-pci@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: A lingering doubt on PCI-MMIO region of PCI-passthrough-device
Thanks Alex.
I was/am aware of GPA-ranges backed by mmap'ed HVA-ranges.
On further thought, I think I have all the missing pieces (except one,
listed at the end of this email).
The steps, as I understand them :
a)
There are three stages :
* pre-configuration by host/qemu.
* guest-vm bios.
* guest-vm kernel.
b)
The host procures the following memory-slots (amongst others) via mmap :
* guest-ram
* pci-config-space : with the help of vfio's ioctls.
* pci-bar-mmio-space : with the help of vfio's ioctls.
For the above memory-slots,
* guest-ram physical-address is known (0), so ept-mappings for guest-ram
  are set up even before guest-vm begins to boot up.
* there is no concept of guest-physical-address for pci-config-space.
* pci-bar-mmio-space physical address is not known yet, so ept-mappings
  for pci-bar-mmio-space are not set up (yet).
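To make the bookkeeping in (b) concrete, here is a toy Python model of my
understanding. It is only a sketch: the names `Slot` and `initial_slots` are
made up for illustration and are not QEMU or KVM data structures, and the
sizes are arbitrary. It only records which slots already have a
guest-physical-address before the guest boots.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """Hypothetical record for one mmap'ed region on the host."""
    name: str
    size: int
    gpa: Optional[int] = None  # None: guest-physical-address not decided yet

    @property
    def mapped(self) -> bool:
        # A slot can get ept-mappings only once its GPA is known.
        return self.gpa is not None


def initial_slots(ram_size: int, bar_size: int) -> dict:
    """State before the guest-vm bios has run (assumed, per step b)."""
    return {
        "guest-ram": Slot("guest-ram", ram_size, gpa=0),
        "pci-bar-mmio": Slot("pci-bar-mmio", bar_size),  # GPA still unknown
    }


slots = initial_slots(ram_size=1 << 30, bar_size=1 << 16)
for slot in slots.values():
    print(slot.name, "mapped" if slot.mapped else "pending")
```

The pci-config-space is deliberately left out of the model, since (as per my
understanding above) it has no guest-physical-address of its own.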
c)
qemu starts the guest, and guest-vm-bios runs next.
This bios is "owned by qemu", and is "definitely different" from the
host-bios (qemu is an altogether different "hardware"). qemu-bios and
host-bios handle pci bus/enumeration "completely differently".
When the pci-enumeration runs during this guest-vm-bios stage, it
accesses the pci-device config-space (backed on the host by mmap'ed
mappings). Note that the guest-kernel is not yet in the picture.
"OBVIOUSLY", all accesses (reads/writes) to pci-config space go to the
pci-config-space memory-slot (handled purely by qemu-bios code).
Once the guest-vm bios carves out guest-physical-addresses for the
pci-device-bars, it programs the bars by writing to the bar offsets in
the pci-config-space. qemu detects this, and does the following :
* does not relay the actual writes to the physical bars on the host.
* since the bar guest-physical-addresses are now known, sets up the
  missing ept-mappings for pci-bar-mmio-space.
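The bar handling in (c), combined with Alex's note that the emulated bars are
used for sizing and that the mapping appears once the bar is programmed and
memory is enabled, can be sketched as a toy model. This is not QEMU code:
`EmulatedBar` and `probe_size` are hypothetical names, and only a 32-bit
memory bar is modelled. The sizing handshake (write all ones, read back,
invert the address bits, add one) and the memory-enable bit (bit 1 of the
command register) are standard PCI behaviour.

```python
PCI_COMMAND_MEMORY = 0x2  # memory-space enable bit in the PCI command register


class EmulatedBar:
    """Toy 32-bit memory bar; the physical bar on the host is never written."""

    def __init__(self, size):
        # Memory bar sizes are powers of two, at least 16 bytes.
        assert size >= 16 and size & (size - 1) == 0
        self.size = size
        self.value = 0          # what the guest reads back from the bar
        self.command = 0        # emulated command register
        self.mapped_gpa = None  # GPA where the mmap backing is exposed, if any

    def write_bar(self, val):
        # Address bits below the bar size always read back as zero.
        self.value = val & ~(self.size - 1) & 0xFFFFFFFF
        self._update_mapping()

    def read_bar(self):
        return self.value

    def write_command(self, val):
        self.command = val
        self._update_mapping()

    def _update_mapping(self):
        # The mapping exists only while memory decoding is enabled.
        if self.command & PCI_COMMAND_MEMORY:
            self.mapped_gpa = self.value
        else:
            self.mapped_gpa = None


def probe_size(bar):
    """Size a bar the way firmware does: write all ones and read back."""
    bar.write_bar(0xFFFFFFFF)
    mask = bar.read_bar() & 0xFFFFFFF0  # drop the low flag bits
    return ((~mask) & 0xFFFFFFFF) + 1


bar = EmulatedBar(size=0x10000)
size = probe_size(bar)                 # bios learns the bar size
bar.write_bar(0xFE000000)              # bios picks a GPA (arbitrary example)
before_enable = bar.mapped_gpa         # still unmapped here
bar.write_command(PCI_COMMAND_MEMORY)  # memory enabled: mapping appears
```

In this model the point where `mapped_gpa` becomes non-None corresponds to
the moment the missing ept-mappings can be set up.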
d)
Finally, guest-kernel takes over, and
* all accesses to ram go through the usual two-stage translation.
* all accesses to pci-bar-mmio go through the usual two-stage translation.
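By two-stage translation I mean guest-virtual to guest-physical (guest page
tables), then guest-physical to host-physical (ept). A minimal sketch, with
each table flattened to a single-level dict of page numbers (real page tables
are multi-level, and all addresses below are made up):

```python
PAGE = 4096


def translate(addr, table):
    """Translate addr through a {source_page: target_page} table."""
    page, offset = divmod(addr, PAGE)
    if page not in table:
        raise LookupError(f"no mapping for page {page:#x}")
    return table[page] * PAGE + offset


def two_stage(gva, guest_pt, ept):
    gpa = translate(gva, guest_pt)  # stage 1: owned by the guest-kernel
    return translate(gpa, ept)      # stage 2: owned by the hypervisor


guest_pt = {0x10: 0xFE000}    # guest maps a virtual page onto the bar GPA
ept = {0xFE000: 0x12345}      # hypervisor maps the bar GPA onto host memory
hpa = two_stage(0x10008, guest_pt, ept)
print(hex(hpa))
```

An access to a GPA with no entry in `ept` raises here, which is loosely the
analogue of the fault that would occur before the bar mappings exist.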
Requests :
i)
Alex / QEMU-experts : kindly correct me if I am wrong :) till now.
ii)
Once the guest-kernel boots up, how are accesses to pci-config-space
handled? Is the qemu-bios again involved in pci-config-space accesses
after the guest-kernel has booted up?
Once again, many thanks to everyone for their time and help.
Thanks and Regards,
Ajay
On Sat, Dec 20, 2025 at 5:36 AM Alex Williamson <alex@...zbot.org> wrote:
>
> On Fri, 19 Dec 2025 11:53:56 +0530
> Ajay Garg <ajaygargnsit@...il.com> wrote:
>
> > Hi Alex.
> > Kindly help if the steps listed in the previous email are correct.
> >
> > (Have added qemu mailing-list too, as it might be a qemu thing too as
> > virtual-pci is in picture).
> >
> > On Mon, Dec 15, 2025 at 9:20 AM Ajay Garg <ajaygargnsit@...il.com> wrote:
> > >
> > > Thanks Alex.
> > >
> > > So does something like the following happen :
> > >
> > > i)
> > > During bootup, guest starts pci-enumeration as usual.
> > >
> > > ii)
> > > Upon discovering the "passthrough-device", guest carves the physical
> > > MMIO regions (as usual) in the guest's physical-address-space, and
> > > starts-to/attempts to program the BARs with the
> > > guest-physical-base-addresses carved out.
> > >
> > > iii)
> > > These attempts to program the BARs (lying in the
> > > "passthrough-device"'s config-space), are intercepted by the
> > > hypervisor instead (causing a VM-exit in the interim).
> > >
> > > iv)
> > > The hypervisor uses the above info to update the EPT, to ensure GPA =>
> > > HPA conversions go fine when the guest tries to access the PCI-MMIO
> > > regions later (once guest is fully booted up). Also, the hypervisor
> > > marks the operation as success (without "really" re-programming the
> > > BARs).
> > >
> > > v)
> > > The VM-entry is called, and the guest resumes with the "impression"
> > > that the BARs have been "programmed by guest".
> > >
> > > Is the above sequencing correct at a bird's view level?
>
> It's not far off. The key is simply that we can create a host virtual
> mapping to the device BARs, ie. an mmap. The guest enumerates emulated
> BARs, they're only used for sizing and locating the BARs in the guest
> physical address space. When the guest BAR is programmed and memory
> enabled, the address space in QEMU is populated at the BAR indicated
> GPA using the mmap backing. KVM memory slots are used to fill the
> mappings in the vCPU. The same BAR mmap is also used to provide DMA
> mapping of the BAR through the IOMMU in the legacy type1 IOMMU backend
> case. Barring a vIOMMU, the IOMMU IOVA space is the guest physical
> address space. Thanks,
>
> Alex