Message-ID: <ZV4mbn3LLQHsKIGq@slm.duckdns.org>
Date: Wed, 22 Nov 2023 06:03:58 -1000
From: Tejun Heo <tj@...nel.org>
To: zhuangel570 <zhuangel570@...il.com>
Cc: jiangshanlai@...il.com, linux-kernel@...r.kernel.org,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH] workqueue: Make sure that wq_unbound_cpumask is never
empty
On Tue, Nov 21, 2023 at 11:39:36AM -1000, Tejun Heo wrote:
> During boot, depending on how the housekeeping and workqueue.unbound_cpus
> masks are set, wq_unbound_cpumask can end up empty. Since 8639ecebc9b1
> ("workqueue: Implement non-strict affinity scope for unbound workqueues"),
> this may end up feeding -1 as a CPU number into the scheduler, leading to oopses.
>
> BUG: unable to handle page fault for address: ffffffff8305e9c0
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> ...
> Call Trace:
> <TASK>
> select_idle_sibling+0x79/0xaf0
> select_task_rq_fair+0x1cb/0x7b0
> try_to_wake_up+0x29c/0x5c0
> wake_up_process+0x19/0x20
> kick_pool+0x5e/0xb0
> __queue_work+0x119/0x430
> queue_work_on+0x29/0x30
> ...
>
> An empty wq_unbound_cpumask is a clear misconfiguration and already
> disallowed once the system is booted up. Let's warn on and ignore
> unbound_cpumask restrictions which lead to no unbound CPUs. While at it,
> also remove the now unnecessary empty check on wq_unbound_cpumask in
> wq_select_unbound_cpu().
>
> Signed-off-by: Tejun Heo <tj@...nel.org>
> Reported-by: Yong He <alexyonghe@...cent.com>
> Link: http://lkml.kernel.org/r/20231120121623.119780-1-alexyonghe@tencent.com
> Fixes: 8639ecebc9b1 ("workqueue: Implement non-strict affinity scope for unbound workqueues")
> Cc: stable@...r.kernel.org # v6.6+
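
For readers following along, the "warn on and ignore" behavior described in
the quoted message can be sketched in user-space C roughly as below. This is
only an illustration, not the applied patch: the CPU mask is modeled as a
64-bit integer, and cpumask_model_t and restrict_mask() are hypothetical
names that do not exist in the kernel. The idea is that a restriction is
applied only when the resulting mask still contains at least one CPU;
otherwise it is skipped with a warning, so the mask can never become empty.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space model of a CPU mask: one bit per CPU, up to 64. */
typedef uint64_t cpumask_model_t;

/*
 * Apply the restriction "allowed" to *mask only if the intersection is
 * non-empty. Returns 1 if the restriction was applied, 0 if it was
 * ignored (after printing a warning). *mask is left untouched when the
 * restriction is ignored.
 */
static int restrict_mask(cpumask_model_t *mask, cpumask_model_t allowed,
                         const char *name)
{
    cpumask_model_t result;

    assert(mask != NULL);
    result = *mask & allowed;

    if (result == 0) {
        /* An empty result is treated as a misconfiguration and skipped. */
        fprintf(stderr,
                "warning: ignoring %s restriction, it leaves no unbound CPUs\n",
                name);
        return 0;
    }

    *mask = result;
    return 1;
}
```

With a starting mask of CPUs 0-3 (0xF), restricting to CPUs 0-1 (0x3) is
applied, while a later restriction to CPUs 2-3 (0xC) would empty the mask and
is therefore ignored. Because the mask is guaranteed non-empty afterwards,
code that later picks a CPU from it does not need its own empty check.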
Applied to wq/for-6.7-fixes.
Thanks.
--
tejun