Message-ID: <trdjd7zhpldyeurmpvx4zpgjoz7hmf3ugayybz4gagu2iue56c@zswmzvauqnxk>
Date: Mon, 13 Oct 2025 16:46:18 +0800
From: Inochi Amaoto <inochiama@...il.com>
To: Genes Lists <lists@...ience.com>, Jens Axboe <axboe@...nel.dk>,
linux-block@...r.kernel.org, linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Cc: linux-pci@...r.kernel.org
Subject: Re: mainline boot fail nvme/block? [BISECTED]
On Fri, Oct 10, 2025 at 07:49:34PM -0400, Genes Lists wrote:
> On Fri, 2025-10-10 at 08:54 -0600, Jens Axboe wrote:
> > On 10/10/25 8:29 AM, Genes Lists wrote:
> > > Mainline fails to boot - 6.17.1 works fine.
> > > Same kernel on an older laptop without any nvme works just fine.
> > >
> > > It seems to get stuck enumerating disks within the initramfs
> > > created by
> > > dracut.
> > >
> > > ,,,
> > >
> > > Machine is dell xps 9320 laptop (firmware 2.23.0) with nvme
> > > partitioned:
> > >
> > > > # lsblk -f
> > > > NAME          FSTYPE      FSVER LABEL FSAVAIL FSUSE% MOUNTPOINTS
> > > > sr0
> > > > nvme0n1
> > > > ├─nvme0n1p1   vfat        FAT32 ESP      2.6G    12% /boot
> > > > ├─nvme0n1p2   ext4        1.0   root    77.7G    42% /
> > > > └─nvme0n1p3   crypto_LUKS 2
> > > >   └─home      btrfs             home     1.3T    26% /opt
> > > >                                                      /home
> > >
> > >
> > >
> > > Will try do bisect over the weekend.
> >
> > That'd be great, because there's really not much to glean from this
> > bug
> > report.
>
> Bisect landed here. (cc linux-pci@...r.kernel.org)
> Hopefully it is helpful, even though I don't see MSI in lspci output
> (which is provided below).
>
> gene
>
>
> 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b is the first bad commit
> commit 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b (HEAD)
> Author: Inochi Amaoto <inochiama@...il.com>
> Date: Thu Aug 14 07:28:32 2025 +0800
>
> PCI/MSI: Add startup/shutdown for per device domains
>
> As the RISC-V PLIC cannot apply affinity settings without invoking
> irq_enable(), it will make the interrupt unavailble when used as an
> underlying interrupt chip for the MSI controller.
>
> Implement the irq_startup() and irq_shutdown() callbacks for the
> PCI MSI
> and MSI-X templates.
>
> For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT, the parent
> startup
> and shutdown functions are invoked. That allows the interrupt on
> the parent
> chip to be enabled if the interrupt has not been enabled during
> allocation. This is necessary for MSI controllers which use PLIC as
> underlying parent interrupt chip.
>
> Suggested-by: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Inochi Amaoto <inochiama@...il.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
> Tested-by: Chen Wang <unicorn_wang@...look.com> # Pioneerbox
> Reviewed-by: Chen Wang <unicorn_wang@...look.com>
> Acked-by: Bjorn Helgaas <bhelgaas@...gle.com>
> Link: https://lore.kernel.org/all/20250813232835.43458-3-inochiama@...il.com
>
> drivers/pci/msi/irqdomain.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/msi.h | 2 ++
> 2 files changed, 54 insertions(+)
>
>
> ----------------------------------------- lspci output ----------------
> In case helpful here's lspci output:
>
> 0000:00:00.0 Host bridge: Intel Corporation Raptor Lake-P/U 4p+8e cores
> Host Bridge/DRAM Controller
> 0000:00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P
> [Iris Xe Graphics] (rev 04)
> 0000:00:04.0 Signal processing controller: Intel Corporation Raptor
> Lake Dynamic Platform and Thermal Framework Processor Participant
> 0000:00:05.0 Multimedia controller: Intel Corporation Raptor Lake IPU
> 0000:00:06.0 System peripheral: Intel Corporation RST VMD Managed
> Controller
> 0000:00:07.0 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4
> PCI Express Root Port #0
> 0000:00:07.2 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4
> PCI Express Root Port #2
> 0000:00:08.0 System peripheral: Intel Corporation GNA Scoring
> Accelerator module
> 0000:00:0a.0 Signal processing controller: Intel Corporation Raptor
> Lake Crashlog and Telemetry (rev 01)
> 0000:00:0d.0 USB controller: Intel Corporation Raptor Lake-P
> Thunderbolt 4 USB Controller
> 0000:00:0d.2 USB controller: Intel Corporation Raptor Lake-P
> Thunderbolt 4 NHI #0
> 0000:00:0d.3 USB controller: Intel Corporation Raptor Lake-P
> Thunderbolt 4 NHI #1
> 0000:00:0e.0 RAID bus controller: Intel Corporation Volume Management
> Device NVMe RAID Controller Intel Corporation
> 0000:00:12.0 Serial controller: Intel Corporation Alder Lake-P
> Integrated Sensor Hub (rev 01)
> 0000:00:14.0 USB controller: Intel Corporation Alder Lake PCH USB 3.2
> xHCI Host Controller (rev 01)
> 0000:00:14.2 RAM memory: Intel Corporation Alder Lake PCH Shared SRAM
> (rev 01)
> 0000:00:14.3 Network controller: Intel Corporation Raptor Lake PCH CNVi
> WiFi (rev 01)
> 0000:00:15.0 Serial bus controller: Intel Corporation Alder Lake PCH
> Serial IO I2C Controller #0 (rev 01)
> 0000:00:15.1 Serial bus controller: Intel Corporation Alder Lake PCH
> Serial IO I2C Controller #1 (rev 01)
> 0000:00:16.0 Communication controller: Intel Corporation Alder Lake PCH
> HECI Controller (rev 01)
> 0000:00:1e.0 Communication controller: Intel Corporation Alder Lake PCH
> UART #0 (rev 01)
> 0000:00:1e.3 Serial bus controller: Intel Corporation Alder Lake SPI
> Controller (rev 01)
> 0000:00:1f.0 ISA bridge: Intel Corporation Raptor Lake LPC/eSPI
> Controller (rev 01)
> 0000:00:1f.3 Multimedia audio controller: Intel Corporation Raptor
> Lake-P/U/H cAVS (rev 01)
> 0000:00:1f.4 SMBus: Intel Corporation Alder Lake PCH-P SMBus Host
> Controller (rev 01)
> 0000:00:1f.5 Serial bus controller: Intel Corporation Alder Lake-P PCH
> SPI Controller (rev 01)
> 0000:01:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 0000:02:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 0000:02:01.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 0000:02:02.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 0000:02:03.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 0000:02:04.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen
> Ridge 2020] (rev 02)
> 10000:e0:06.0 PCI bridge: Intel Corporation Raptor Lake PCIe 4.0
> Graphics Port
> 10000:e1:00.0 Non-Volatile memory controller: SK hynix Platinum
> P41/PC801 NVMe Solid State Drive
>
>
> --
> Gene

I think this is caused by the VMD device, for which I have a temporary
fix here [1]. Since I don't know how VMD works, I hope someone can help
turn it into a formal fix.

[1] https://lore.kernel.org/all/qs2vydzm6xngul77xuwjli7h757gzfhmb4siiklzogihz5oplw@gsvgn75lib6t/

Regards,
Inochi