linux-kernel - Re: NMI stack overflow during resume of PCIe bridge with CONFIG_HARDLOCKUP

Message-ID: <87v7h5ia3d.ffs@tglx>
Date: Tue, 13 Jan 2026 20:30:46 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Bert Karwatzki <spasswolf@....de>, linux-kernel@...r.kernel.org
Cc: linux-next@...r.kernel.org, spasswolf@....de, Mario Limonciello
 <mario.limonciello@....com>, Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>, Clark Williams <clrkwllms@...nel.org>, Steven
 Rostedt <rostedt@...dmis.org>, Christian König
 <christian.koenig@....com>,
 regressions@...ts.linux.dev, linux-pci@...r.kernel.org,
 linux-acpi@...r.kernel.org, "Rafael J . Wysocki"
 <rafael.j.wysocki@...el.com>, acpica-devel@...ts.linux.dev, Robert Moore
 <robert.moore@...el.com>, Saket Dumbre <saket.dumbre@...el.com>, Bjorn
 Helgaas <bhelgaas@...gle.com>, Clemens Ladisch <clemens@...isch.de>,
 Jinchao Wang <wangjinchao600@...il.com>, Yury Norov
 <yury.norov@...il.com>, Anna Schumaker <anna.schumaker@...cle.com>,
 Baoquan He <bhe@...hat.com>, "Darrick J. Wong" <djwong@...nel.org>, Dave
 Young <dyoung@...hat.com>, Doug Anderson <dianders@...omium.org>,
 "Guilherme G. Piccoli" <gpiccoli@...lia.com>, Helge Deller
 <deller@....de>, Ingo Molnar <mingo@...nel.org>, Jason Gunthorpe
 <jgg@...pe.ca>, Jonathan
 Cameron <Jonathan.Cameron@...wei.com>, Joel Granados
 <joel.granados@...nel.org>, John Ogness <john.ogness@...utronix.de>, Kees
 Cook <kees@...nel.org>, Li Huafei <lihuafei1@...wei.com>, "Luck, Tony"
 <tony.luck@...el.com>, Luo Gengkun <luogengkun@...weicloud.com>, Max
 Kellermann <max.kellermann@...os.com>, Nam Cao <namcao@...utronix.de>,
 oushixiong <oushixiong@...inos.cn>, Petr Mladek <pmladek@...e.com>,
 Qianqiang Liu <qianqiang.liu@....com>, Sergey Senozhatsky
 <senozhatsky@...omium.org>, Sohil Mehta <sohil.mehta@...el.com>, Tejun Heo
 <tj@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, Thorsten Blum
 <thorsten.blum@...ux.dev>, Ville Syrjala <ville.syrjala@...ux.intel.com>,
 Vivek Goyal <vgoyal@...hat.com>, Yunhui Cui <cuiyunhui@...edance.com>,
 Andrew Morton <akpm@...ux-foundation.org>, W_Armin@....de
Subject: Re: NMI stack overflow during resume of PCIe bridge with
 CONFIG_HARDLOCKUP_DETECTOR=y

On Tue, Jan 13 2026 at 18:50, Bert Karwatzki wrote:
> On Tuesday, 13.01.2026 at 16:24 +0100, Thomas Gleixner wrote:
>
>>  What's more likely is that after a while _ALL_ CPUs are hung up in
>> the NMI handler after they tripped over the HPET read.
>
> I'm not sure about that; my latest test run (with v6.18) crashed with
> only one message from exc_nmi().

What does "crashed" mean? Did it actually crash and output something, or did
the machine just go dead? I assume the latter, as you have no output.

>> along with the full output of /proc/iomem
>
> The physical address is 0xf0100000
>
> $ cat /proc/iomem
> f0000000-fcffffff : PCI Bus 0000:00
>   f0000000-f7ffffff : PCI ECAM 0000 [bus 00-7f]
>     f0000000-f7ffffff : pnp 00:00

That's the memory-mapped PCI config space, and the access in question targets:

   MMIO_START			0xf0000000
   BUSNUM	0x01 << 20      0x00100000
   SLOT/FN	0x00 << 12      0x00000000
   OFFSET	0x00 <<  0      0x00000000
                               -----------
                                0xf0100000

Offset 0 is vendor/device ID IIRC.
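
As a rough cross-check of the arithmetic above, the decomposition can be
sketched in Python. This assumes the standard ECAM layout (bus in bits
27:20, device in bits 19:15, function in bits 14:12, register offset in
bits 11:0) and that the window starts at bus 0, as the /proc/iomem output
suggests. The helper names ecam_decode and ecam_encode are made up for
illustration; they are not kernel functions.

```python
def ecam_decode(phys_addr, ecam_base, start_bus=0):
    """Split a physical address inside an ECAM window into
    (bus, device, function, offset). Hypothetical helper."""
    rel = phys_addr - ecam_base
    if rel < 0:
        raise ValueError("address lies below the ECAM base")
    bus = start_bus + (rel >> 20)    # 1 MiB of config space per bus
    device = (rel >> 15) & 0x1F      # 32 devices per bus
    function = (rel >> 12) & 0x7     # 8 functions per device
    offset = rel & 0xFFF             # 4 KiB of config space per function
    return bus, device, function, offset


def ecam_encode(ecam_base, bus, device, function, offset, start_bus=0):
    """Inverse of ecam_decode: build the physical address."""
    return (ecam_base + ((bus - start_bus) << 20) + (device << 15)
            + (function << 12) + offset)


# Values taken from this thread: ECAM base 0xf0000000, address 0xf0100000.
bus, dev, fn, off = ecam_decode(0xF0100000, 0xF0000000)
print(f"{bus:02x}:{dev:02x}.{fn} offset {off:#x}")  # prints 01:00.0 offset 0x0
```

With the numbers from this report, the address decodes to device 01:00.0 at
register offset 0, which matches the table.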

Anyway, if that access does not complete because of a hardware issue,
then any subsequent access to the MMIO-mapped HPET stalls as well.

As the HPET is the active clocksource on your machine, this obviously
affects not only the NMI watchdog readout but also the regular
timekeeper accesses and every other MMIO access all over the place.
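
For reference, a minimal sketch of how one might confirm which clocksource
is active, assuming the usual sysfs layout under
/sys/devices/system/clocksource/clocksource0. The function names are
hypothetical, and the check only tells you whether the timekeeper reads go
through the HPET; it says nothing about why the config space access stalls.

```python
from pathlib import Path

# Assumed sysfs location of the clocksource attributes.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_current_clocksource(base=CLOCKSOURCE_DIR):
    """Return the active clocksource name, or None if it cannot be read."""
    try:
        return (Path(base) / "current_clocksource").read_text().strip()
    except OSError:
        return None


def timekeeping_uses_hpet(name):
    """True if the given clocksource name is the MMIO-mapped HPET."""
    return name is not None and name.strip() == "hpet"


current = read_current_clocksource()
print("current clocksource:", current)
print("timekeeping reads HPET via MMIO:", timekeeping_uses_hpet(current))
```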

So gradually your machine just stalls on outstanding MMIO transactions
without further notice. The NMI is just a red herring.

You need to figure out why that MMIO access to that device's
configuration space stalls, as everything else is just consequential
damage.

There is not much that can be done about that unless the PCI bus raises
a failure interrupt and some magic reset sequence aborts the outstanding
stalled transactions.

Whether that's feasible or not, I don't know. The failure mechanism
might run into the same stall scenario when accessing the PCI muck for
reset...

Sorry for not being helpful.

Thanks,

        tglx