Message-ID: <aYaYnMYik3SC45bb@templeofstupid.com>
Date: Fri, 6 Feb 2026 17:42:52 -0800
From: Krister Johansen <kjlx@...pleofstupid.com>
To: Matthew Ruffell <matthew.ruffell@...onical.com>,
	Michael Kelley <mhklinux@...look.com>
Cc: "DECUI@...rosoft.com" <DECUI@...rosoft.com>,
	"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
	"haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
	"jakeo@...rosoft.com" <jakeo@...rosoft.com>,
	"kwilczynski@...nel.org" <kwilczynski@...nel.org>,
	"kys@...rosoft.com" <kys@...rosoft.com>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"longli@...rosoft.com" <longli@...rosoft.com>,
	"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
	"mani@...nel.org" <mani@...nel.org>,
	"robh@...nel.org" <robh@...nel.org>,
	"stable@...r.kernel.org" <stable@...r.kernel.org>,
	"wei.liu@...nel.org" <wei.liu@...nel.org>
Subject: Re: [PATCH] PCI: hv: Allocate MMIO from above 4GB for the config
 window

Hi Matthew and Michael,

On Fri, Jan 23, 2026 at 06:39:24AM +0000, Michael Kelley wrote:
> From: Matthew Ruffell <matthew.ruffell@...onical.com> Sent: Thursday, January 22, 2026 9:39 PM
> > > > There's a parameter to the kexec() command that governs whether it uses the
> > > > kexec_file_load() system call or the kexec_load() system call.
> > > > I wonder if that parameter makes a difference in the problem described for this
> > > > patch.
> > 
> > Yes, it does indeed make a difference. I have been debugging this the past few
> > days, and my colleague Melissa noticed that the problem reproduces when secure
> > boot is disabled, but it does not reproduce when secure boot is enabled.
> > Additionally, it reproduces on jammy, but not noble. It turns out that
> > kexec-tools on jammy defaults to kexec_load() when secure boot is disabled,
> > and when enabled, it instead uses kexec_file_load(). On noble, it defaults to
> > first trying kexec_file_load() before falling back to kexec_load(), so the
> > issue does not reproduce.
> 
> This is good info, and definitely a clue. So to be clear, the problem repros
> only when kexec_load() is used. With kexec_file_load(), it does not repro. Is that
> right? I saw a similar distinction when working on commit 304386373007,
> though in the opposite direction!

Just to muddy the waters here, I have a team on the Noble 6.8 kernel
train that's running into this issue on Standard_D#pds_v6 with secure
boot disabled. I've validated via strace(8) that kexec(8) is calling
kexec_file_load(2). In this case, the problem Dexuan describes in the
cover letter still occurs, but it affects NIC attachment instead of the
NVMe storage device (i.e. the pci_hyperv attach of the NIC reports the
pass-through error instead of succeeding).


> > > > >  	/*
> > > > >  	 * Set up a region of MMIO space to use for accessing configuration
> > > > > -	 * space.
> > > > > +	 * space. Use the high MMIO range to not conflict with the hyperv_drm
> > > > > +	 * driver (which normally gets MMIO from the low MMIO range) in the
> > > > > +	 * kdump kernel of a Gen2 VM, which fails to reserve the framebuffer
> > > > > +	 * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
> > > > > +	 * zero in the kdump kernel.
> > > > >  	 */
> > > > > -	ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
> > > > > +	ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
> > > > >  				  PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
> > > > >  	if (ret)
> > > > >  		return ret;
> > > > > --
> > 
> > Thank you for the patch Dexuan.
> > 
> > This patch fixes the problem on Ubuntu 5.15, and 6.8 based kernels
> > booting V6 instance types on Azure with Gen 2 images.
> 
> Are you seeing the problem on x86/64 or arm64 instances in Azure?
> "V6 instance types" could be either, I think, but I'm guessing you
> are on x86/64.
> 
> And just to confirm: are you seeing the problem with the
> Hyper-V DRM driver, or the Hyper-V FB driver? This patch mentions
> the DRM driver, so I assume that's the problematic config.

In the case I've seen, it's been arm64 and not x86.  They're currently
running with the hyperv_drm driver, but they've also tried swapping to
the fb driver, with no change in results.

> > Tested-by: Matthew Ruffell <matthew.ruffell@...onical.com>

All of the above said, I also tested Dexuan's fix on these instances and
found that kdump works again with the patch applied.
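
For readers following the thread, here is a minimal userspace sketch of
what raising the minimum address from 0 to SZ_4G does to a first-fit
search. This is not the kernel's vmbus_allocate_mmio() implementation:
struct mmio_range, pick_mmio() and the window addresses mentioned
afterwards are hypothetical and only illustrate the idea.

```c
#include <stdint.h>

#define SZ_4G 0x100000000ULL

/* Hypothetical free MMIO window; not a kernel structure. */
struct mmio_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/*
 * First-fit search for `size` bytes aligned to `align` (a power of two),
 * at or above `min`. Returns the chosen base address, or 0 if no window
 * can hold the request.
 */
static uint64_t pick_mmio(const struct mmio_range *r, int n, uint64_t min,
			  uint64_t size, uint64_t align)
{
	for (int i = 0; i < n; i++) {
		/* Never start below the caller's minimum address. */
		uint64_t base = r[i].start > min ? r[i].start : min;

		/* Round up to the requested alignment. */
		base = (base + align - 1) & ~(align - 1);

		/* Accept only if the whole request fits in this window. */
		if (base <= r[i].end && r[i].end - base >= size - 1)
			return base;
	}
	return 0;
}
```

Given an assumed low window at 0xf8000000-0xfbffffff and an assumed high
window at 0x1000000000-0x10ffffffff, a request with min = 0 lands in the
low window, where it can collide with a framebuffer range that was never
reserved. The same request with min = SZ_4G skips the low window and
lands in the high one.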

Tested-by: Krister Johansen <johansen@...pleofstupid.com>

-K
