Message-ID:
<SN6PR02MB4157545DAFDCCE0028439DB2D497A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 22 Jan 2026 20:22:50 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Long Li <longli@...rosoft.com>, Dexuan Cui <DECUI@...rosoft.com>, KY
Srinivasan <kys@...rosoft.com>, Haiyang Zhang <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, "lpieralisi@...nel.org"
<lpieralisi@...nel.org>, "kwilczynski@...nel.org" <kwilczynski@...nel.org>,
"mani@...nel.org" <mani@...nel.org>, "robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>, Jake Oshins
<jakeo@...rosoft.com>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
CC: "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config
window
From: Long Li <longli@...rosoft.com> Sent: Thursday, January 22, 2026 11:14 AM
>
> > From: Dexuan Cui <decui@...rosoft.com> Sent: Wednesday, January 21, 2026 6:04 PM
> > >
> > > There has been a longstanding MMIO conflict between the pci_hyperv
> > > driver's config_window (see hv_allocate_config_window()) and the
> > > hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
> > > both get MMIO from the low MMIO range below 4GB; this is not an issue
> > > in the normal kernel since the VMBus driver reserves the framebuffer
> > > MMIO in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram()
> > > can always get the reserved framebuffer MMIO; however, a Gen2 VM's
> > > kdump kernel fails to reserve the framebuffer MMIO in
> > > vmbus_reserve_fb() because the screen_info.lfb_base is zero in the
> > > kdump kernel: the screen_info is not initialized at all in the kdump
> > > kernel, because the EFI stub code, which initializes screen_info,
> > > doesn't run in the case of kdump.
> >
> > I don't think this is correct. Yes, the EFI stub doesn't run, but screen_info should
> > be initialized in the kdump kernel by the code that loads the kdump kernel into
> > the reserved crash memory. See discussion in the commit message for commit
> > 304386373007.
>
> On AMD64, screen_info is passed through the kexec system call. This is not the
> case on ARM64, which relies on EFI to get screen_info.

Hmmm. So does the problem described here only happen on arm64? If so, that
might be worth noting in the commit message.

I found my notes from working on commit 304386373007. I don't remember
testing on arm64, and my notes don't mention it. So I'm wondering if the problem
fixed by that commit could happen on arm64. That's potentially a separate issue
from this one. I'll do some experiments to verify.

>
> However, Hyper-V guarantees the framebuffer MMIO is below 4GB. So, the patch works
> by allocating PCI MMIO separately from that of the framebuffer.

Yes, that seems right. But there's still something nagging at me about this,
though I can't immediately identify a problem. I'll follow up if something
comes to me. :-)

Michael

>
> Long
>
> >
> > I wonder if commit a41e0ab394e4 broke the initialization of screen_info in the
> > kdump kernel. Or perhaps there is now a rev-lock between the kernel with this
> > commit and a new version of the user space kexec command.
> >
> > There's a parameter to the kexec command that governs whether it uses the
> > kexec_file_load() system call or the kexec_load() system call.
> > I wonder if that parameter makes a difference in the problem described for this
> > patch.
> >
> > I can't immediately remember if, when I was working on commit 304386373007, I
> > tested kdump in a Gen 2 VM with an NVMe OS disk to ensure that MMIO space
> > was properly allocated to the frame buffer driver (either hyperv_fb or
> > hyperv_drm). I'm thinking I did, but tomorrow I'll check for any definitive notes on
> > that.
> >
> > Michael
> >
> > >
> > > When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the
> > > kdump kernel, if pci_hyperv in the kdump kernel loads before
> > > hyperv_drm loads, pci_hyperv's vmbus_allocate_mmio() gets the
> > > framebuffer MMIO and tries to use it, but since the host thinks that
> > > the MMIO range is still in use by hyperv_drm, the host refuses to
> > > accept the MMIO range as the config window, and pci_hyperv's
> > > hv_pci_enter_d0() errors out:
> > > "PCI Pass-through VSP failed D0 Entry with status c0370048".
> > >
> > > This PCI error in the kdump kernel was not fatal in the past because
> > > the kdump kernel normally doesn't rely on pci_hyperv, and the root
> > > file system is on a VMBus SCSI device.
> > >
> > > Now, a VM on Azure can boot from NVMe, i.e. the root FS can be on an
> > > NVMe device, which depends on pci_hyperv. When the PCI error occurs,
> > > the kdump kernel fails to boot up since no root FS is detected.
> > >
> > > Fix the MMIO conflict by allocating MMIO above 4GB for the
> > > config_window.
> > >
> > > Note: we still need to figure out how to address the possible MMIO
> > > conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI
> > > MMIO BARs, but that's of low priority because all PCI devices
> > > available to a Linux VM on Azure should use 64-bit BARs and should not
> > > use 32-bit BARs -- I checked Mellanox VFs, MANA VFs, NVMe devices, and
> > > GPUs in Linux VMs on Azure, and found no 32-bit BARs.
> > >
> > > Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for
> > > Microsoft Hyper-V VMs")
> > > Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> > > Cc: stable@...r.kernel.org
> > > ---
> > > drivers/pci/controller/pci-hyperv.c | 8 ++++++--
> > > 1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/pci-hyperv.c
> > > b/drivers/pci/controller/pci-hyperv.c
> > > index 1e237d3538f9..a6aecb1b5cab 100644
> > > --- a/drivers/pci/controller/pci-hyperv.c
> > > +++ b/drivers/pci/controller/pci-hyperv.c
> > > @@ -3406,9 +3406,13 @@ static int hv_allocate_config_window(struct
> > > hv_pcibus_device *hbus)
> > >
> > > /*
> > > * Set up a region of MMIO space to use for accessing configuration
> > > - * space.
> > > + * space. Use the high MMIO range to not conflict with the hyperv_drm
> > > + * driver (which normally gets MMIO from the low MMIO range) in the
> > > + * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
> > > + * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
> > > + * zero in the kdump kernel.
> > > */
> > > - ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
> > > + ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
> > > PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
> > > if (ret)
> > > return ret;
> > > --
> > > 2.43.0
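
For readers following the thread, here is a minimal user-space sketch (not kernel code) of why raising the minimum address passed to the allocator sidesteps the conflict described in the quoted commit message. Everything in it is an assumption made for illustration: model_alloc_mmio(), struct busy_range, and the 0xF8000000 framebuffer address used in the note below are hypothetical, and the real vmbus_allocate_mmio() works on the kernel's resource tree rather than a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4G 0x100000000ULL

/* Hypothetical model of one already-reserved MMIO range (inclusive bounds). */
struct busy_range {
	uint64_t start;
	uint64_t end;
};

static bool overlaps(uint64_t s, uint64_t e, const struct busy_range *b)
{
	return s <= b->end && b->start <= e;
}

/*
 * Illustrative first-fit search: find the lowest align-aligned address in
 * [min, max] where size bytes fit without touching any busy range.
 * Returns true and stores the address in *out on success.
 */
bool model_alloc_mmio(const struct busy_range *busy, size_t nbusy,
		      uint64_t min, uint64_t max,
		      uint64_t size, uint64_t align, uint64_t *out)
{
	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
	assert(size > 0);

	uint64_t start = (min + align - 1) & ~(align - 1);

	/* "start >= min" also catches wrap-around from the align-up above. */
	while (start >= min && start <= max && size - 1 <= max - start) {
		uint64_t end = start + size - 1;
		bool clash = false;

		for (size_t i = 0; i < nbusy; i++) {
			if (overlaps(start, end, &busy[i])) {
				/* Retry just past the busy range, re-aligned. */
				uint64_t next = (busy[i].end + 1 + align - 1) &
						~(align - 1);
				if (next <= start)
					return false; /* address space wrapped */
				start = next;
				clash = true;
				break;
			}
		}
		if (!clash) {
			*out = start;
			return true;
		}
	}
	return false;
}
```

With an empty busy list (modelling the kdump case, where the framebuffer range was never reserved) and a minimum at the start of an assumed low MMIO window such as 0xF8000000, the sketch hands out the framebuffer's own address. With the minimum set to SZ_4G it returns 0x100000000 whether or not the framebuffer was reserved, which is the effect the patch relies on.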