Message-Id: <20240926132130.b1d1f6943d225368d3062d5e@linux-foundation.org>
Date: Thu, 26 Sep 2024 13:21:30 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, LKML
<linux-kernel@...r.kernel.org>, Hillf Danton <hdanton@...a.com>, Tejun Heo
<tj@...nel.org>, syzbot+943d34fa3cf2191e3068@...kaller.appspotmail.com
Subject: Re: [PATCH] kthread: Unpark only parked kthread
On Fri, 13 Sep 2024 23:46:34 +0200 Frederic Weisbecker <frederic@...nel.org> wrote:
> Calling into kthread unparking unconditionally is mostly harmless when
> the kthread is already unparked. The wake up is then simply ignored
> because the target is not in TASK_PARKED state.
>
> However if the kthread is per CPU, the wake up is preceded by a call
> to kthread_bind() which expects the task to be inactive and in
> TASK_PARKED state, which obviously isn't the case if it is unparked.
>
> As a result, calling kthread_stop() on an unparked per-cpu kthread
> triggers such a warning:
>
> WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
> <TASK>
> kthread_stop+0x17a/0x630 kernel/kthread.c:707
> destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
> wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
> netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
> default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
> ops_exit_list net/core/net_namespace.c:178 [inline]
> cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> process_one_work kernel/workqueue.c:3231 [inline]
> process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
> kthread+0x2f0/0x390 kernel/kthread.c:389
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> </TASK>
>
> Fix this by skipping unnecessary unparking while stopping a kthread.
How does userspace trigger this? Is it an issue in current mainline?
Should we backport the fix into -stable kernels? (That depends on the
answers to the above questions.)
It looks like the issue is old, so a Fixes: probably isn't needed. But
as the issue is old, why did it come to light now?
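
For readers following the thread, the failure mode described in the quoted
changelog can be modeled in plain userspace C. This is only a sketch under
stated assumptions: every name below (model_kthread, model_bind,
model_unpark, model_stop_unconditional, model_stop_guarded) is hypothetical
and is not a kernel API, and the guard shown is an illustration of the idea
"unpark only a parked kthread", not the patch itself.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_state { MODEL_RUNNING, MODEL_PARKED };

struct model_kthread {
	enum model_state state;
	bool per_cpu;	/* models a per-CPU kthread */
	int warnings;	/* counts simulated WARN hits */
};

/* Models the bind step, which expects the task to be parked. */
static void model_bind(struct model_kthread *k)
{
	if (k->state != MODEL_PARKED)
		k->warnings++;
}

/* Models unparking: per-CPU threads are rebound before the wake up. */
static void model_unpark(struct model_kthread *k)
{
	if (k->per_cpu)
		model_bind(k);
	/* The wake up only has an effect on a parked thread. */
	if (k->state == MODEL_PARKED)
		k->state = MODEL_RUNNING;
}

/* Stop path that unparks unconditionally (the behavior being reported). */
static void model_stop_unconditional(struct model_kthread *k)
{
	model_unpark(k);
}

/* Stop path that unparks only a thread that is actually parked. */
static void model_stop_guarded(struct model_kthread *k)
{
	if (k->state == MODEL_PARKED)
		model_unpark(k);
}
```

In this model, stopping a running per-CPU thread through the unconditional
path increments the warning counter, while the guarded path leaves it
untouched and still unparks a thread that really is parked.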