Message-ID: <CAHk-=whgqmXgL_toAQWF793WuYMCNsBhvTW8B0xAD360eXX8-A@mail.gmail.com>
Date: Wed, 30 Jul 2025 20:31:44 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>, Tejun Heo <tj@...nel.org>,
Valentin Schneider <vschneid@...hat.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>
Subject: Re: [GIT PULL] Scheduler updates for v6.17

On Sun, 27 Jul 2025 at 23:48, Ingo Molnar <mingo@...nel.org> wrote:
>
> PSI:
>
> - Improve scalability by optimizing psi_group_change() cpu_clock() usage
> (Peter Zijlstra)

I suspect this is buggy.

Maybe this is coincidence, but that sounds very unlikely:

watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:3:7996]
CPU#0 Utilization every 4s during lockup:
  #1: 100% system,   0% softirq,   0% hardirq,   0% idle
  #2: 100% system,   1% softirq,   1% hardirq,   0% idle
  #3: 100% system,   0% softirq,   0% hardirq,   0% idle
  #4: 101% system,   0% softirq,   0% hardirq,   0% idle
  #5: 100% system,   0% softirq,   0% hardirq,   0% idle
Modules linked in: uinput rfcomm nf_nat_tftp nf_conntrack_tftp
bridge stp llc ccm nf_conntrack_netbios_ns nf_conntrack_broadcast
nft_fib_inet [...]
CPU: 0 UID: 0 PID: 7996 Comm: kworker/0:3 Not tainted 6.16.0-06574-gd9104cec3e8f #164 VOLUNTARY
Hardware name: Dell Inc. XPS 13 9380/0KTW76, BIOS 1.26.0 09/11/2023
Workqueue: events psi_avgs_work
RIP: 0010:collect_percpu_times+0x2f6/0x320
Code: c0 0f b6 c0 c1 e0 09 41 09 c5 e9 14 ff ff ff 49 8b 0f 48 89 4c 24 48 49 8b 4f 08 48 89 4c 24 50 e9 6e fe ff ff 4c 89 c0 f3 90 <4a> 8b 14 ed c0 3c 20 93
RSP: 0018:ffffd4d3cc113d60 EFLAGS: 00000202
RAX: ffffffff93b26880 RBX: fffff4d3bfba0ed4 RCX: 000000000000622d
RDX: ffff8ced1e597880 RSI: fffffffc6684cefc RDI: 0000000000000000
RBP: ffffd4d3cc113db8 R08: ffffffff93b26880 R09: 0000000000000000
R10: 00001386e5a9adc7 R11: 000000000000eda9 R12: ffffd4d3cc113dd8
R13: 0000000000000006 R14: 0000000000000006 R15: fffff4d3bfba0ec0
FS: 0000000000000000(0000) GS:ffff8ced8a8f1000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000027f400c50010 CR3: 00000001b641e005 CR4: 00000000003726f0
Call Trace:
<TASK>
psi_avgs_work+0x31/0xa0
process_one_work+0x135/0x220
worker_thread+0x2e7/0x420
kthread+0xbd/0x1a0
ret_from_fork+0x133/0x160
ret_from_fork_asm+0x11/0x20
</TASK>

and yeah, the laptop was dead at that point. Thankfully it had been
alive enough that the watchdog messages made it into the logs.

There was more than one of those reports (34 of them, to be exact), but
they all look pretty much the same. The RIP is always the same:
collect_percpu_times+0x2f6/0x320, and that's just the instruction
after the 'pause' instruction that comes from

    psi_read_begin ->
        return read_seqcount_begin(per_cpu_ptr(&psi_seq, cpu));

which is from that __read_seqcount_begin() code that waits for the
writer to go away:

    while (unlikely((__seq = seqprop_sequence(s)) & 1))     \
        cpu_relax();                                         \

and clearly it never does.
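
Purely to illustrate that failure mode, here is a minimal userspace toy
model of the read-side loop quoted above, hand-rolled with C11 atomics.
It is not the kernel's seqcount_t implementation, and the toy_* names
are made up for the sketch; it only shows how a reader spins forever
once the sequence is left odd, i.e. once the write side "never goes
away":

/*
 * Toy userspace model of the seqcount read-side spin described above.
 * This is NOT the kernel's seqcount_t code; the toy_* names are
 * invented for illustration.  Build with: cc -std=c11 toy_seq.c
 */
#include <stdatomic.h>
#include <stdio.h>

static atomic_uint seq;		/* stand-in for psi_seq */
static int protected_data;

/* Write side: sequence is odd while an update is in flight. */
static void toy_write_begin(void) { atomic_fetch_add(&seq, 1); }
static void toy_write_end(void)   { atomic_fetch_add(&seq, 1); }

/* Read side: mirrors the __read_seqcount_begin() loop quoted above. */
static unsigned toy_read_begin(void)
{
	unsigned s;

	while ((s = atomic_load(&seq)) & 1)
		;	/* cpu_relax() in the kernel */
	return s;
}

static int toy_read_retry(unsigned start)
{
	return atomic_load(&seq) != start;
}

int main(void)
{
	unsigned start;
	int val;

	/* Balanced writer: the reader makes progress. */
	toy_write_begin();
	protected_data = 42;
	toy_write_end();

	do {
		start = toy_read_begin();
		val = protected_data;
	} while (toy_read_retry(start));
	printf("read %d (seq=%u)\n", val, start);

	/*
	 * Unbalanced writer: the sequence stays odd forever, so the
	 * next toy_read_begin() spins -- the same symptom as the
	 * soft lockup above.
	 */
	toy_write_begin();
	/* no toy_write_end() ... */
	toy_read_begin();	/* never returns */

	return 0;
}

The last toy_read_begin() call never returns, which matches what the
oops above shows in collect_percpu_times(): the reader keeps seeing an
odd sequence and keeps executing 'pause'.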

Why? I have no idea. But hopefully this makes somebody go "D'oh!" and
send me a trivial fix.

Please?

                 Linus