Message-ID: <6ca2b58c-f689-447f-abc8-4e8dd9bf677a@afaics.de>
Date: Sun, 23 Jun 2024 13:39:41 +0200
From: Harald Dunkel <harri@...ics.de>
To: linux-kernel@...r.kernel.org
Subject: shutdown gets stuck on ancient Intel CPUs (>10 y)
Hi folks,
I've got quite a number of ancient systems that do not shut down
gracefully. On "reboot" they get stuck instead, with these messages
on the console (for example):
[3100272.289498] NMI watchdog: Watchdog detected hard LOCKUP on cpu 5
[3100274.206164] NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
[3100274.432631] NMI watchdog: Watchdog detected hard LOCKUP on cpu 12
[3100277.888319] NMI watchdog: Watchdog detected hard LOCKUP on cpu 11
[3100278.939282] NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
[3100280.147329] rcu: INFO: rcu_preempt self-detected stall on CPU
[3100280.590580] rcu: 14-...!: (5248 ticks this GP) idle=afe4/1/0x4000000000000000 softirq=85687678/85687678 fqs=38
[3100280.713513] rcu: rcu_preempt kthread timer wakeup didn't happen for 5308 jiffies! g227828925 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[3100280.856203] rcu: Possible timer handling issue on cpu=3 timer-softirq=46376324
[3100280.945837] rcu: rcu_preempt kthread starved for 5367 jiffies! g227828925 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=3
[3100281.077083] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[3100281.188462] rcu: RCU grace-period kthread stack dump:
[3100281.251112] rcu: Stack dump where RCU GP kthread last ran:
[3100282.053462] NMI watchdog: Watchdog detected hard LOCKUP on cpu 10
[3100284.061698] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [kworker/8:0:2191850]
[3100284.065698] watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [migration/9:62]
[3100284.077698] watchdog: BUG: soft lockup - CPU#13 stuck for 22s! [migration/13:82]
[3100285.095828] NMI watchdog: Watchdog detected hard LOCKUP on cpu 7
[3100288.045764] watchdog: BUG: soft lockup - CPU#4 stuck for 23s! [etcd:1614]
[3100291.949829] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [etcd:556385]
[3100292.037830] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:21]
[3100292.081831] watchdog: BUG: soft lockup - CPU#15 stuck for 23s! [etcd:1613]
[3100293.997839] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
[3100308.078097] watchdog: BUG: soft lockup - CPU#14 stuck for 48s! [rasdaemon:2267297]
[3100312.062164] watchdog: BUG: soft lockup - CPU#8 stuck for 52s! [kworker/8:0:2191850]
[3100312.066164] watchdog: BUG: soft lockup - CPU#9 stuck for 48s! [migration/9:62]
[3100312.078164] watchdog: BUG: soft lockup - CPU#13 stuck for 48s! [migration/13:82]
[3100316.046230] watchdog: BUG: soft lockup - CPU#4 stuck for 49s! [etcd:1614]
[3100319.950295] watchdog: BUG: soft lockup - CPU#0 stuck for 48s! [etcd:556385]
[3100320.038297] watchdog: BUG: soft lockup - CPU#1 stuck for 49s! [migration/1:21]
[3100320.082297] watchdog: BUG: soft lockup - CPU#15 stuck for 49s! [etcd:1613]
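For longer console captures, a small script can tally which CPUs report hard and soft lockups. The following is only a minimal sketch: it assumes the console output has been saved as plain text, it matches just the two watchdog message formats shown above, and the names `tally_lockups` and `format_report` are made up for illustration.

```python
import re
from collections import defaultdict

# The two watchdog message formats that appear in the console output above.
HARD = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")
SOFT = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s")


def tally_lockups(lines):
    """Return (hard, soft): hard is a set of CPU numbers,
    soft maps CPU number -> longest reported stuck time in seconds."""
    hard = set()
    soft = defaultdict(int)
    for line in lines:
        m = HARD.search(line)
        if m:
            hard.add(int(m.group(1)))
            continue
        m = SOFT.search(line)
        if m:
            cpu, secs = int(m.group(1)), int(m.group(2))
            soft[cpu] = max(soft[cpu], secs)
    return hard, dict(soft)


def format_report(hard, soft):
    """Render the tally as a short human-readable summary."""
    out = ["hard lockup CPUs: " + ", ".join(str(c) for c in sorted(hard))]
    for cpu in sorted(soft):
        out.append(f"soft lockup CPU#{cpu}: up to {soft[cpu]}s")
    return "\n".join(out)
```

Usage would be something like `print(format_report(*tally_lockups(open("console.log"))))`, where `console.log` is a hypothetical file name. Run over the excerpt above, it should list hard lockups on CPUs 2, 3, 5, 6, 7, 10, 11 and 12, and soft lockups on the other eight (0, 1, 4, 8, 9, 13, 14, 15), so all 16 CPUs are stuck one way or the other.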
These systems are >10 years old. The CPUs are Intel Xeon E5620 or similar. Kernel
is 6.1.90. An old NAS running with an Atom CPU D525 and kernel 6.7.5 is affected
as well.
The problem is quite reproducible (especially on the NAS), but it only shows
up after some uptime. Immediately after startup a reboot works as expected.
After a few days of uptime it doesn't.
Any helpful comment is highly appreciated.
Harri