Message-ID: <CAJZ5v0iAJN4eTdp9S=CKbMnVn78R7UnBKbLjBTdRhHebE0i7dA@mail.gmail.com>
Date: Tue, 25 Nov 2025 20:46:36 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Bert Karwatzki <spasswolf@....de>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Christian König <christian.koenig@....com>,
"Mario Limonciello (AMD) (kernel.org)" <superm1@...nel.org>, linux-kernel@...r.kernel.org,
linux-next@...r.kernel.org, regressions@...ts.linux.dev,
linux-pci@...r.kernel.org, linux-acpi@...r.kernel.org,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>, acpica-devel@...ts.linux.dev,
Robert Moore <robert.moore@...el.com>, Saket Dumbre <saket.dumbre@...el.com>
Subject: Re: Crash during resume of pcie bridge due to infinite loop in ACPICA
On Mon, Nov 24, 2025 at 11:34 PM Bert Karwatzki <spasswolf@....de> wrote:
>
> On Monday, 17.11.2025 at 17:40 +0100, Rafael J. Wysocki wrote:
> >
> > Well, what you have found appears to be an issue in the AML bytecode
> > interpreter which may be one of two things: (1) a bug in the
> > interpreter itself or (2) a bytecode issue that causes the interpreter
> > to crash (eventually) and the latter is quite a bit more likely.
> >
> > I'd suggest opening a new issue at
> > https://github.com/acpica/acpica/issues and attaching the acpidump
> > output from the affected system, to start with.
>
> I've reported the bug to ACPICA github:
> https://github.com/acpica/acpica/issues/1060
I've seen your report, thanks for filing it.
> There's no "infinite" loop, but a loop running for 5051 (0x13BB) iterations until its timeout
> counter reaches zero (most likely because the hardware is unresponsive). Soon afterwards (only a
> handful of iterations later in the walk loop in acpi_ps_parse_aml()) the crash happens. I think
> the crash actually occurs inside acpi_ps_parse_loop(), so I wouldn't rule out an interpreter
> bug just yet.
> The crash also always happens (if it happens ...) in the 30592nd iteration of the walk loop,
> so I'm now monitoring the internals of acpi_ps_parse_loop() only in this iteration of the walk
> loop. (I've tried to monitor the parse loop before, but that only led to excessive memory
> consumption and an activated OOM killer). The debugging code can be found here:
> https://gitlab.freedesktop.org/spasswolf/linux-stable/-/commits/amdgpu_suspend_resume?ref_type=heads
>
> So far I've had no crash with this.
What may be happening, but this is just a theory, is that the
interpreter aborts the evaluation of a method due to an internal
timeout, essentially the control_state->control.loop_timeout check in
acpi_ds_exec_end_control_op(), and that this leads to a subsequent hard
failure like a deadlock.
This may be tested by increasing the ACPI_MAX_LOOP_TIMEOUT value, but
I'm not sure it's practical to try that.
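
To illustrate the theory, here is a minimal user-space sketch of a
deadline check evaluated at the end of each loop iteration. It is not
the ACPICA code: every sketch_* name, the abstract "tick" unit and the
simulated clock are assumptions made for illustration only. It shows
that a loop needing 5051 iterations is cut short under a small limit
and runs to completion once the limit is raised.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the ACPICA definitions. */
enum sketch_status { SKETCH_OK = 0, SKETCH_LOOP_TIMEOUT = 1 };

struct sketch_control_state {
    uint64_t loop_timeout; /* absolute deadline, in abstract ticks */
};

/* Wraparound-safe "a is later than b" comparison. */
static bool sketch_time_after(uint64_t a, uint64_t b)
{
    return (int64_t)(b - a) < 0;
}

/*
 * Model a While loop whose predicate stays true for iterations_needed
 * iterations. Each iteration advances a simulated clock by
 * ticks_per_iteration. The deadline is armed once on loop entry and
 * checked at the end of every iteration; if it has passed, the loop is
 * aborted with an error instead of running to completion.
 */
static enum sketch_status sketch_run_while(uint64_t now,
                                           uint64_t iterations_needed,
                                           uint64_t ticks_per_iteration,
                                           uint64_t max_timeout_ticks,
                                           uint64_t *iterations_done)
{
    struct sketch_control_state state;

    state.loop_timeout = now + max_timeout_ticks; /* arm on loop entry */
    *iterations_done = 0;

    while (*iterations_done < iterations_needed) {
        now += ticks_per_iteration; /* simulated cost of the loop body */
        (*iterations_done)++;

        /* End-of-iteration check: abort once the deadline has passed. */
        if (sketch_time_after(now, state.loop_timeout))
            return SKETCH_LOOP_TIMEOUT;
    }
    return SKETCH_OK;
}
```

In this model, sketch_run_while(0, 5051, 1, 30, &done) aborts with
SKETCH_LOOP_TIMEOUT, while the same call with a limit of 10000 ticks
returns SKETCH_OK. Whatever the caller does after the aborted case is
where a subsequent hard failure could come from.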