lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251202172837.GA3078292@bhelgaas>
Date: Tue, 2 Dec 2025 11:28:37 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: René Rebe <rene@...ctco.de>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
	Riccardo Mottola <riccardo.mottola@...ero.it>,
	Manivannan Sadhasivam <mani@...nel.org>,
	Brian Norris <briannorris@...omium.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Lukas Wunner <lukas@...ner.de>,
	Mario Limonciello <mario.limonciello@....com>
Subject: Re: [PATCH] PCI: Fix PCI bridges not to go to D3Hot on older RISC
 systems

[+cc Mani, Brian (a5fb3ff63287 authors), Rafael, Lukas, Mario]

On Tue, Dec 02, 2025 at 05:40:07PM +0100, René Rebe wrote:
> Commit a5fb3ff63287 ("PCI: Allow PCI bridges to go to D3Hot on all
> non-x86") was bisected to break various non-x86 RISC Unix systems,
> e.g. sparc64, see two example oopses below. Fix by only allowing D3Hot
> on modern ARM64, PPC64 and RISCV ISAs besides new enough x86.

I think we need some kind of analysis of what is happening to the PCI
devices here.  I don't know why the CPU architecture per se would be
related to PCI power management.

pci_bridge_d3_possible() is already a barely maintainable hodge podge
of random things that work and don't work.  Generally speaking most of
those cases relate to firmware.  

> Sun Blade 1000:
> ERROR(0): Cheetah error trap taken afsr[0010080005000000] afar[000007f900800000] TL1(0)
> ERROR(0): TPC[100a05a4] TNPC[100a05a8] O7[42acc8] TSTATE[4411001603]
> ERROR(0):
> TPC<MakeIocReady+0xc/0x278 [mptbase]>
> ERROR(0): M_SYND(0),  E_SYND(0), Privileged
> ERROR(0): Highest priority error (0000080000000000) "Bus error response from system bus"
> ERROR(0): D-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000]
> ERROR(0): D-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[0000000000000000]
> ERROR(0): I-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000] u[0000000000000000] l[0000000000000000]
> ERROR(0): I-cache INSN0[0000000000000000] INSN1[0000000000000000] INSN2[0000000000000000] INSN3[0000000000000000]
> ERROR(0): I-cache INSN4[0000000000000000] INSN5[0000000000000000] INSN6[0000000000000000] INSN7[0000000000000000]
> ERROR(0): E-cache idx[b08040] tag[000000001e008fa0]
> ERROR(0): E-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[ffffffffffffffff]
> Kernel panic - not syncing: Irrecoverable deferred error trap.
> CPU: 0 UID: 0 PID: 46 Comm: (udev-worker) Not tainted 6.14.0-rc1-00001-ga5fb3ff63287 #18
> Call Trace:
> [<00000000004294b0>] panic+0xf0/0x370
> [<0000000000435bc4>] cheetah_deferred_handler+0x2c8/0x2d8
> [<0000000000405e88>] c_deferred+0x18/0x24
> [<00000000100a05a4>] MakeIocReady+0xc/0x278 [mptbase]

I assume both of these crashes are related to the
CHIPREG_READ32(&ioc->chip->Doorbell) in mpt_GetIocState(), e.g., maybe
that PCI read failed because an upstream bridge was not in D0 and
therefore treated the read as an unsupported request.

> [<00000000100a089c>] mpt_do_ioc_recovery+0x8c/0x1054 [mptbase]
> [<000000001009f2d4>] mpt_attach+0x920/0xa68 [mptbase]
> [<000000001012424c>] mptsas_probe+0x8/0x3e8 [mptsas]
> [<0000000000788308>] local_pci_probe+0x24/0x70
> [<0000000000788dac>] pci_device_probe+0x1c0/0x1d0
> [<000000000082633c>] really_probe+0x13c/0x29c
> [<0000000000826590>] __driver_probe_device+0xf4/0x104
> [<0000000000826614>] driver_probe_device+0x24/0xa0
> [<000000000082683c>] __driver_attach+0xe8/0x104
> [<0000000000824da0>] bus_for_each_dev+0x58/0x84
> [<0000000000825508>] bus_add_driver+0xdc/0x1f8
> [<0000000000827110>] driver_register+0x70/0x120
> 
> Niagara T1:
> mptsas 0000:07:00.0: Unable to change power state from D3cold to D0, device inaccessible
> NON-RESUMABLE ERROR: Reporting on cpu 31
> NON-RESUMABLE ERROR: TPC [0x0000000010184034] <MakeIocReady+0x10/0x298 [mptbase]>
> NON-RESUMABLE ERROR: RAW [1f10000000000007:0000000e3179235c:0000000202000004:000000ea00300000
> NON-RESUMABLE ERROR: 00000000001f0000:0000000000000000:0000000000000000:0000000000000000]
> NON-RESUMABLE ERROR: handle [0x1f10000000000007] stick [0x0000000e3179235c]
> NON-RESUMABLE ERROR: type [precise nonresumable]
> NON-RESUMABLE ERROR: attrs [0x02000004] < PIO sp-faulted priv >
> NON-RESUMABLE ERROR: raddr [0x000000ea00300000]
> Kernel panic - not syncing: Non-resumable error.
> CPU: 31 UID: 0 PID: 367 Comm: (udev-worker) Not tainted 6.16.12+3-sparc64-smp #1 NONE  Debian 6.16.12-2+sparc64.1
> Call Trace:
> [<00000000004373c4>] dump_stack+0x8/0x18
> [<0000000000429540>] panic+0xf4/0x398
> [<000000000043afcc>] sun4v_nonresum_error+0x16c/0x240
> [<0000000000406eb8>] sun4v_nonres_mondo+0xc8/0xd8
> [<0000000010184034>] MakeIocReady+0x10/0x298 [mptbase]
> [<00000000101844b4>] mpt_do_ioc_recovery+0x9c/0x1110 [mptbase]
> [<00000000101836f8>] mpt_attach+0xb58/0xd20 [mptbase]
> [<0000000010287f30>] mptsas_probe+0x10/0x440 [mptsas]
> [<0000000000b3fab0>] local_pci_probe+0x30/0x80
> [<0000000000b405d4>] pci_device_probe+0xb4/0x240
> [<0000000000bfd348>] really_probe+0xc8/0x400
> [<0000000000bfd70c>] __driver_probe_device+0x8c/0x160
> [<0000000000bfd8c8>] driver_probe_device+0x28/0x100
> [<0000000000bfdb7c>] __driver_attach+0xbc/0x1e0
> [<0000000000bfacfc>] bus_for_each_dev+0x5c/0xc0
> [<0000000000bfcafc>] driver_attach+0x1c/0x40
> Press Stop-A (L1-A) from sun keyboard or send break
> twice on console to return to the boot prom
> 
> Fixes: a5fb3ff63287 ("PCI: Allow PCI bridges to go to D3Hot on all non-x86")
> Signed-off-by: René Rebe <rene@...ctco.de>
> ---
> Tested on Sun Blade 1000, and shipping in all T2/Linux builds since 2025-08-01
> ---
>  drivers/pci/pci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b14dd064006c..7619d2cfa66d 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3033,9 +3033,9 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge)
>  
>  		/*
>  		 * Out of caution, we only allow PCIe ports from 2015 or newer
> -		 * into D3 on x86.
> +		 * into D3 or other modern ISAs only.
>  		 */
> -		if (!IS_ENABLED(CONFIG_X86) || dmi_get_bios_year() >= 2015)
> +		if (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_PPC64) || IS_ENABLED(CONFIG_RISCV) || dmi_get_bios_year() >= 2015)
>  			return true;
>  		break;
>  	}
> -- 
> 2.52.0
> 
> -- 
> René Rebe, ExactCODE GmbH, Berlin, Germany
> https://exactco.dehttps://t2linux.comhttps://patreon.com/renerebe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ