Message-ID: <trdjd7zhpldyeurmpvx4zpgjoz7hmf3ugayybz4gagu2iue56c@zswmzvauqnxk>
Date: Mon, 13 Oct 2025 16:46:18 +0800
From: Inochi Amaoto <inochiama@...il.com>
To: Genes Lists <lists@...ience.com>, Jens Axboe <axboe@...nel.dk>, 
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org
Cc: linux-pci@...r.kernel.org
Subject: Re: mainline boot fail nvme/block? [BISECTED]

On Fri, Oct 10, 2025 at 07:49:34PM -0400, Genes Lists wrote:
> On Fri, 2025-10-10 at 08:54 -0600, Jens Axboe wrote:
> > On 10/10/25 8:29 AM, Genes Lists wrote:
> > > Mainline fails to boot - 6.17.1 works fine.
> > > Same kernel on an older laptop without any nvme works just fine.
> > > 
> > > It seems to get stuck enumerating disks within the initramfs
> > > created by
> > > dracut.
> > > 
> > > ,,,
> > > 
> > > Machine is dell xps 9320 laptop (firmware 2.23.0) with nvme
> > > partitioned:
> > > 
> > >     # lsblk -f
> > >     NAME        FSTYPE      FSVER LABEL FSAVAIL FSUSE% MOUNTPOINTS
> > >     sr0
> > >     nvme0n1
> > >     ├─nvme0n1p1 vfat        FAT32 ESP   2.6G    12% /boot
> > >     ├─nvme0n1p2 ext4        1.0   root  77.7G    42% /
> > >     └─nvme0n1p3 crypto_LUKS 2
> > >       └─home    btrfs             home  1.3T    26% /opt
> > >                                                     /home
> > > 
> > > 
> > > 
> > > Will try do bisect over the weekend.
> > 
> > That'd be great, because there's really not much to glean from this
> > bug
> > report.
> 
> Bisect landed here. (cc linux-pci@...r.kernel.org)
> Hopefully it is helpful, even though I don't see MSI in lspci output
> (which is provided below).
> 
> gene
> 
> 
> 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b is the first bad commit
> commit 54f45a30c0d0153d2be091ba2d683ab6db6d1d5b (HEAD)
> Author: Inochi Amaoto <inochiama@...il.com>
> Date:   Thu Aug 14 07:28:32 2025 +0800
> 
>     PCI/MSI: Add startup/shutdown for per device domains
> 
>     As the RISC-V PLIC cannot apply affinity settings without invoking
>     irq_enable(), it will make the interrupt unavailble when used as an
>     underlying interrupt chip for the MSI controller.
> 
>     Implement the irq_startup() and irq_shutdown() callbacks for the PCI MSI
>     and MSI-X templates.
> 
>     For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT, the parent startup
>     and shutdown functions are invoked. That allows the interrupt on the parent
>     chip to be enabled if the interrupt has not been enabled during
>     allocation. This is necessary for MSI controllers which use PLIC as
>     underlying parent interrupt chip.
> 
>     Suggested-by: Thomas Gleixner <tglx@...utronix.de>
>     Signed-off-by: Inochi Amaoto <inochiama@...il.com>
>     Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>     Tested-by: Chen Wang <unicorn_wang@...look.com> # Pioneerbox
>     Reviewed-by: Chen Wang <unicorn_wang@...look.com>
>     Acked-by: Bjorn Helgaas <bhelgaas@...gle.com>
>     Link: https://lore.kernel.org/all/20250813232835.43458-3-inochiama@...il.com
> 
>  drivers/pci/msi/irqdomain.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/msi.h         |  2 ++
>  2 files changed, 54 insertions(+)
> 
> 
> ----------------------------------------- lspci output ----------------
> In case helpful here's lspci output:
> 
> 0000:00:00.0 Host bridge: Intel Corporation Raptor Lake-P/U 4p+8e cores Host Bridge/DRAM Controller
> 0000:00:02.0 VGA compatible controller: Intel Corporation Raptor Lake-P [Iris Xe Graphics] (rev 04)
> 0000:00:04.0 Signal processing controller: Intel Corporation Raptor Lake Dynamic Platform and Thermal Framework Processor Participant
> 0000:00:05.0 Multimedia controller: Intel Corporation Raptor Lake IPU
> 0000:00:06.0 System peripheral: Intel Corporation RST VMD Managed Controller
> 0000:00:07.0 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4 PCI Express Root Port #0
> 0000:00:07.2 PCI bridge: Intel Corporation Raptor Lake-P Thunderbolt 4 PCI Express Root Port #2
> 0000:00:08.0 System peripheral: Intel Corporation GNA Scoring Accelerator module
> 0000:00:0a.0 Signal processing controller: Intel Corporation Raptor Lake Crashlog and Telemetry (rev 01)
> 0000:00:0d.0 USB controller: Intel Corporation Raptor Lake-P Thunderbolt 4 USB Controller
> 0000:00:0d.2 USB controller: Intel Corporation Raptor Lake-P Thunderbolt 4 NHI #0
> 0000:00:0d.3 USB controller: Intel Corporation Raptor Lake-P Thunderbolt 4 NHI #1
> 0000:00:0e.0 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller Intel Corporation
> 0000:00:12.0 Serial controller: Intel Corporation Alder Lake-P Integrated Sensor Hub (rev 01)
> 0000:00:14.0 USB controller: Intel Corporation Alder Lake PCH USB 3.2 xHCI Host Controller (rev 01)
> 0000:00:14.2 RAM memory: Intel Corporation Alder Lake PCH Shared SRAM (rev 01)
> 0000:00:14.3 Network controller: Intel Corporation Raptor Lake PCH CNVi WiFi (rev 01)
> 0000:00:15.0 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #0 (rev 01)
> 0000:00:15.1 Serial bus controller: Intel Corporation Alder Lake PCH Serial IO I2C Controller #1 (rev 01)
> 0000:00:16.0 Communication controller: Intel Corporation Alder Lake PCH HECI Controller (rev 01)
> 0000:00:1e.0 Communication controller: Intel Corporation Alder Lake PCH UART #0 (rev 01)
> 0000:00:1e.3 Serial bus controller: Intel Corporation Alder Lake SPI Controller (rev 01)
> 0000:00:1f.0 ISA bridge: Intel Corporation Raptor Lake LPC/eSPI Controller (rev 01)
> 0000:00:1f.3 Multimedia audio controller: Intel Corporation Raptor Lake-P/U/H cAVS (rev 01)
> 0000:00:1f.4 SMBus: Intel Corporation Alder Lake PCH-P SMBus Host Controller (rev 01)
> 0000:00:1f.5 Serial bus controller: Intel Corporation Alder Lake-P PCH SPI Controller (rev 01)
> 0000:01:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 0000:02:00.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 0000:02:01.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 0000:02:02.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 0000:02:03.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 0000:02:04.0 PCI bridge: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] (rev 02)
> 10000:e0:06.0 PCI bridge: Intel Corporation Raptor Lake PCIe 4.0 Graphics Port
> 10000:e1:00.0 Non-Volatile memory controller: SK hynix Platinum P41/PC801 NVMe Solid State Drive
> 
> 
> -- 
> Gene

I think this is caused by the VMD device, for which I have a temporary
fix here [1]. Since I do not know how VMD works, I hope someone can
help turn it into a formal fix.

[1] https://lore.kernel.org/all/qs2vydzm6xngul77xuwjli7h757gzfhmb4siiklzogihz5oplw@gsvgn75lib6t/

Regards,
Inochi
