Message-Id: <20251121135407.53372-1-ni_liqiang@126.com>
Date: Fri, 21 Nov 2025 21:54:07 +0800
From: niliqiang <ni_liqiang@....com>
To: sunilvl@...tanamicro.com
Cc: ajones@...tanamicro.com,
	anup@...infault.org,
	apatel@...tanamicro.com,
	atishp@...shpatra.org,
	bjorn@...nel.org,
	conor+dt@...nel.org,
	deng.weixian@....com.cn,
	devicetree@...r.kernel.org,
	frowand.list@...il.com,
	hu.yuye@....com.cn,
	krzysztof.kozlowski+dt@...aro.org,
	linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org,
	linux-riscv@...ts.infradead.org,
	maz@...nel.org,
	ni.liqiang@....com.cn,
	palmer@...belt.com,
	paul.walmsley@...ive.com,
	robh+dt@...nel.org,
	saravanak@...gle.com,
	tglx@...utronix.de,
	dai.hualiang@....com.cn,
	liu.qingtao2@....com.cn,
	guo.chang2@....com.cn,
	wu.jiabao@....com.cn,
	liu.wenhong35@....com.cn
Subject: Re: [PATCH v16 6/9] irqchip: Add RISC-V advanced PLIC driver for direct-mode

Dear Sunil,

> > > diff --git a/drivers/irqchip/irq-riscv-aplic-main.c b/drivers/irqchip/irq-riscv-aplic-main.c
> > > +static const struct of_device_id aplic_match[] = {
> > > + { .compatible = "riscv,aplic" },
> > > + {}
> > > +};
> > > +
> > > +static struct platform_driver aplic_driver = {
> > > + .driver = {
> > > +  .name  = "riscv-aplic",
> > > +  .of_match_table = aplic_match,
> > > + },
> > > + .probe = aplic_probe,
> > > +};
> > > +builtin_platform_driver(aplic_driver);
> > 
> > Dear Anup Patel and all concerned,
> > 
> > I am writing to inquire about the historical rationale behind defining the APLIC driver's
> > initialization priority using builtin_platform_driver in the current implementation.
> > 
> > In our environment, we are encountering an issue where this priority level causes ACPI-based PCIe
> > enumeration to be executed in the system_unbound_wq work queue. This parallel execution model
> > results in PCIe devices being enumerated in an arbitrary order rather than strictly following the
> > sequence defined in the ACPI DSDT table.
> > 
> > The random enumeration order is adversely affecting customer experience, particularly in scenarios
> > where device ordering is critical for proper system operation or application compatibility.
> > 
> > We are considering modifying the APLIC driver's initialization priority to ensure PCIe enumeration
> > occurs sequentially according to the DSDT specification. However, before proceeding with such
> > changes, we wanted to consult with you regarding:
> > 
> > 1. Were there specific technical considerations that led to the current priority selection?
> > 2. Are there any potential side effects or broader impacts that we might have overlooked?
> > 3. Would you support such a priority adjustment, or do you have alternative suggestions to 
> > address the enumeration order issue?
> > 
> > We greatly appreciate your insights and expertise on this matter, as it will help us make an
> > informed decision while maintaining system stability and compatibility.
> > 
> > Thank you for your time and consideration.
> > 
> 
> IRQ subsystem maintainers rejected the idea of relying on initcalls to
> enforce probe order because initcalls do not guarantee ordering. The
> Linux driver model instead ensures probe order through device
> dependencies. Since PCI INTx depends on the APLIC being probed first,
> the PCI host bridge probe cannot occur until after the APLIC probe
> completes. This requirement and behavior are the same for both DT and
> ACPI. In DT, the driver model uses fw_devlink to establish probe
> ordering, while in ACPI this is handled through either an explicit _DEP
> or, on RISC-V, the GSI mapping.
> 
> Typically, this dependency appears in the DSDT only for the PCI host
> bridge. Individual PCIe devices are enumerated through the standard PCI
> scan once the host bridge has been probed. Therefore, I’m not sure what
> you meant by a probe sequence defined in the DSDT for PCIe devices.
> 
> Regards,
> Sunil

I understand the scenario you described with a single PCI host bridge, where devices are enumerated
through standard PCIe scanning after the host bridge completes probing. However, on ARM and RISC-V
systems there are often multiple PCI host bridges. We are currently facing an issue on a
6-host-bridge system where all of the bridges depend on the APLIC driver. They must wait until the
APLIC driver's probe completes and calls acpi_dev_clear_dependencies() to resolve the dependencies,
after which they are queued one by one onto the system_unbound_wq work queue. Once in the work
queue, however, the 6 host bridges are enumerated in parallel, so they do not follow the order
defined in the firmware's ACPI DSDT table (a simplified sketch of this flow is included further
below). Specifically:
1. The ACPI DSDT table declares the 6 host bridges in a fixed sequence:
Device(PC06) {
    Name(_HID, "PNP0A08")
    Name(_CID, "PNP0A03")
    Name(_UID, 0x6)
    ......
}
Device(PC07) {
    Name(_HID, "PNP0A08")
    Name(_CID, "PNP0A03")
    Name(_UID, 0x7)
    ......
}
Device(PC08) {
    Name(_HID, "PNP0A08")
    Name(_CID, "PNP0A03")
    Name(_UID, 0x8)
    ......
}
...
Device(PC11) {
    Name(_HID, "PNP0A08")
    Name(_CID, "PNP0A03")
    Name(_UID, 0xB)
    ......
}

2. But the OS enumerates them in a different, effectively random order on each boot (first boot sequence ≠ second boot sequence):
First boot:
~ # dmesg | grep -i "PCI Root"
[ 8794.588531] ACPI: PCI Root Bridge [PC08] (domain 0006 [bus 80-ff])
[ 8794.624478] ACPI: PCI Root Bridge [PC06] (domain 0005 [bus 00-ff])
[ 8794.672741] ACPI: PCI Root Bridge [PC10] (domain 0008 [bus 00-ff])
[ 8794.696680] ACPI: PCI Root Bridge [PC07] (domain 0006 [bus 00-7f])
[ 8794.728234] ACPI: PCI Root Bridge [PC11] (domain 0009 [bus 00-ff])
[ 8794.755098] ACPI: PCI Root Bridge [PC09] (domain 0007 [bus 00-ff])
Second boot:
~ # dmesg | grep -i "PCI Root"
[ 8794.588531] ACPI: PCI Root Bridge [PC09] (domain 0007 [bus 00-ff])
[ 8794.624478] ACPI: PCI Root Bridge [PC06] (domain 0005 [bus 00-ff])
[ 8794.672741] ACPI: PCI Root Bridge [PC08] (domain 0006 [bus 80-ff])
[ 8794.696680] ACPI: PCI Root Bridge [PC11] (domain 0009 [bus 00-ff])
[ 8794.728234] ACPI: PCI Root Bridge [PC07] (domain 0006 [bus 00-7f])
[ 8794.755098] ACPI: PCI Root Bridge [PC10] (domain 0008 [bus 00-ff])

This creates a critical issue: when NVMe devices are connected to these host bridges, the
unpredictable kernel scanning sequence causes device identifiers (e.g., /dev/nvme0n1, /dev/nvme1n1)
to change across reboots. In server environments, such device naming instability is unacceptable as
it breaks storage configuration reliability and consistency.
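
To make the mechanism we are describing explicit, here is a much simplified sketch of the flow as we
understand it. Apart from acpi_dev_clear_dependencies(), queue_work(), system_unbound_wq and the
basic workqueue types, the identifiers below are placeholders invented for illustration, not the
actual drivers/acpi/scan.c code:

/*
 * Simplified sketch only: apart from acpi_dev_clear_dependencies(),
 * queue_work() and system_unbound_wq, the identifiers below are
 * placeholders invented for illustration, not the real
 * drivers/acpi/scan.c code.
 */
#include <linux/acpi.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

struct clear_dep_work {
	struct work_struct work;
	struct acpi_device *adev;	/* e.g. one of PC06..PC11 */
};

static void clear_dep_fn(struct work_struct *work)
{
	struct clear_dep_work *cdw = container_of(work, struct clear_dep_work, work);

	/*
	 * The host bridge is attached and scanned from a worker like this
	 * one.  Each bridge gets its own unbound worker, so the six
	 * workers may run concurrently and finish in any order, which is
	 * why the "PCI Root Bridge" lines above differ between boots.
	 */
	dev_info(&cdw->adev->dev, "host bridge enumerated from here\n");
	kfree(cdw);
}

/*
 * Roughly what happens for each dependent host bridge once the APLIC
 * has probed and acpi_dev_clear_dependencies() is called for it.
 */
static void queue_clear_dep(struct acpi_device *adev)
{
	struct clear_dep_work *cdw = kzalloc(sizeof(*cdw), GFP_KERNEL);

	if (!cdw)
		return;

	cdw->adev = adev;
	INIT_WORK(&cdw->work, clear_dep_fn);
	/* system_unbound_wq gives no ordering guarantee between items. */
	queue_work(system_unbound_wq, &cdw->work);
}

Because nothing serializes these workers, the six enumerations can complete in a different order on
every boot, which matches the dmesg output above.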

So far, we have only observed this out-of-order enumeration in RISC-V multi-host-bridge scenarios,
where the APLIC dependency leads to enumeration via system_unbound_wq. We would like to consult the
kernel experts:
1. Has the impact of enumeration disorder in multi-host-bridge scenarios been considered?
2. Are there viable solutions to address the random enumeration caused by system_unbound_wq?
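
To make the second question a little more concrete: one direction we wondered about, shown purely as
an illustration (we have not evaluated the wider impact, and it also assumes the work items get
queued in DSDT declaration order in the first place, which we have not verified), is to funnel these
work items through an ordered workqueue rather than system_unbound_wq:

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/workqueue.h>

/* "acpi_clear_dep" is a made-up name, used only for this illustration. */
static struct workqueue_struct *acpi_clear_dep_wq;

static int __init clear_dep_wq_init(void)
{
	/* An ordered workqueue runs one item at a time, in FIFO order. */
	acpi_clear_dep_wq = alloc_ordered_workqueue("acpi_clear_dep", 0);
	return acpi_clear_dep_wq ? 0 : -ENOMEM;
}

/*
 * ...and then queue_work(acpi_clear_dep_wq, &cdw->work) would replace
 * queue_work(system_unbound_wq, &cdw->work) in the sketch above.
 */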

Best regards,
Liqiang

