Message-ID: <Z1oY1qk-eWU8IcH3@slm.duckdns.org>
Date: Wed, 11 Dec 2024 12:57:26 -1000
From: Tejun Heo <tj@...nel.org>
To: Dave Chinner <david@...morbit.com>
Cc: linux-kernel@...r.kernel.org, linux-xfs@...r.kernel.org
Subject: Re: [6.13-rc0 regression] workqueue throwing cpu affinity warnings during CPU hotplug
Hello, Dave.
Sorry about the really late reply.
On Fri, Nov 22, 2024 at 11:38:19AM +1100, Dave Chinner wrote:
> Hi Tejun,
>
> I just upgraded my test VMs from 6.12.0 to a current TOT kernel and
> I got several of these warnings whilst running fstests whilst
> running CPU hotplug online/offline concurrently with various tests:
>
> [ 2508.109594] ------------[ cut here ]------------
> [ 2508.115669] WARNING: CPU: 23 PID: 133 at kernel/kthread.c:76 kthread_set_per_cpu+0x33/0x50
...
> [ 2508.253909] <TASK>
> [ 2508.311972] unbind_worker+0x1b/0x70
> [ 2508.315444] workqueue_offline_cpu+0xd8/0x1f0
> [ 2508.319554] cpuhp_invoke_callback+0x13e/0x4f0
> [ 2508.328936] cpuhp_thread_fun+0xda/0x120
> [ 2508.332746] smpboot_thread_fn+0x132/0x1d0
> [ 2508.336645] kthread+0x147/0x170
> [ 2508.347646] ret_from_fork+0x3e/0x50
> [ 2508.353845] ret_from_fork_asm+0x1a/0x30
> [ 2508.357773] </TASK>
> [ 2508.357776] ---[ end trace 0000000000000000 ]---
So, this is kthread saying that the thread passed to it doesn't have
PF_KTHREAD set. There haven't been any related changes and the flag is never
cleared once set, so I don't see how that could happen for a kworker.
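
For readers following along, the condition behind that warning can be modelled
in user space roughly as below. This is only a sketch, not the kernel source:
struct model_task, model_to_kthread() and model_warn_count are hypothetical
names made up for illustration, and the counter stands in for the kernel's
WARN machinery. The PF_KTHREAD bit value is assumed to match
include/linux/sched.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match PF_KTHREAD in include/linux/sched.h. */
#define PF_KTHREAD 0x00200000u

/* Hypothetical stand-in for struct task_struct: only the fields used here. */
struct model_task {
	unsigned int flags;
	void *worker_private;
};

/* Counts how many times the modelled warning would have fired. */
static int model_warn_count;

/*
 * Models the sanity check: a task handed to the kthread code is expected
 * to carry PF_KTHREAD. If it does not, record a warning and carry on.
 */
static void *model_to_kthread(const struct model_task *k)
{
	assert(k != NULL);
	if (!(k->flags & PF_KTHREAD))
		model_warn_count++;
	return k->worker_private;
}
```

A kworker is created as a kernel thread, so in this model its flags would
always include PF_KTHREAD and the counter would stay at zero, which is why the
warning is surprising.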
> I have also seen similar traces from the CPUs coming on-line:
>
> [ 2535.818771] WARNING: CPU: 23 PID: 133 at kernel/kthread.c:76 kthread_set_per_cpu+0x33/0x50
> ....
> [ 2535.969004] RIP: 0010:kthread_set_per_cpu+0x33/0x50
> ....
> [ 2508.249599] Call Trace:
> [ 2508.253909] <TASK>
> [ 2535.969029] workqueue_online_cpu+0xe6/0x2f0
> [ 2535.969032] cpuhp_invoke_callback+0x13e/0x4f0
> [ 2535.969044] cpuhp_thread_fun+0xda/0x120
> [ 2535.969047] smpboot_thread_fn+0x132/0x1d0
> [ 2535.969053] kthread+0x147/0x170
> [ 2535.969066] ret_from_fork+0x3e/0x50
> [ 2535.969076] ret_from_fork_asm+0x1a/0x30
> [ 2508.357773] </TASK>
Yeah, this is the same.
> I didn't see these on 6.12.0, so I'm guessing that there is
> something in the merge window that has started triggering this.
I tried a few mixtures of stress-ng + continuous hot [un]plugging but can't
reproduce in the current linus#master. Do you still see this happening?
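
The hotplug half of such a test can be sketched as below, assuming the usual
sysfs control file <cpu_dir>/cpu<N>/online with cpu_dir set to
/sys/devices/system/cpu and root privileges. This is an illustration rather
than the exact setup used here: the helper names are hypothetical, and the
stress-ng load would run separately alongside it.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<cpu_dir>/cpu<N>/online"; returns its length, or -1 if it does not fit. */
static int cpu_online_path(char *buf, size_t size, const char *cpu_dir, int cpu)
{
	int n = snprintf(buf, size, "%s/cpu%d/online", cpu_dir, cpu);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write "1" (online) or "0" (offline) to the given control file. */
static int write_online_flag(const char *path, int online)
{
	FILE *f;
	int ret = 0;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(online ? "1" : "0", f) == EOF)
		ret = -1;
	/* Buffered write errors may only show up when the file is closed. */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}

/* Offline then re-online one CPU `rounds` times; stops at the first error. */
static int toggle_cpu(const char *cpu_dir, int cpu, int rounds)
{
	char path[256];

	if (cpu_online_path(path, sizeof(path), cpu_dir, cpu) < 0)
		return -1;
	for (int i = 0; i < rounds; i++) {
		if (write_online_flag(path, 0) || write_online_flag(path, 1))
			return -1;
	}
	return 0;
}
```

Calling toggle_cpu("/sys/devices/system/cpu", 23, 1000) would cycle CPU 23,
the CPU named in the traces above, through offline and online while the
workload runs.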
Thanks.
--
tejun